postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	0bb51aa967	Improve parsetree representation of special functions such as CURRENT_DATE. We implement a dozen or so parameterless functions that the SQL standard defines special syntax for. Up to now, that was done by converting them into more or less ad-hoc constructs such as "'now'::text::date". That's messy for multiple reasons: it exposes what should be implementation details to users, and performance is worse than it needs to be in several cases. To improve matters, invent a new expression node type SQLValueFunction that can represent any of these parameterless functions. Bump catversion because this changes stored parsetrees for rules. Discussion: <30058.1463091294@sss.pgh.pa.us>	2016-08-16 20:33:01 -04:00
Tom Lane	ed0097e4f9	Add SQL-accessible functions for inspecting index AM properties. Per discussion, we should provide such functions to replace the lost ability to discover AM properties by inspecting pg_am (cf commit `65c5fcd35`). The added functionality is also meant to displace any code that was looking directly at pg_index.indoption, since we'd rather not believe that the bit meanings in that field are part of any client API contract. As future-proofing, define the SQL API to not assume that properties that are currently AM-wide or index-wide will remain so unless they logically must be; instead, expose them only when inquiring about a specific index or even specific index column. Also provide the ability for an index AM to override the behavior. In passing, document pg_am.amtype, overlooked in commit `473b93287`. Andrew Gierth, with kibitzing by me and others Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>	2016-08-13 18:31:14 -04:00
Tom Lane	95bee941be	Fix misestimation of n_distinct for a nearly-unique column with many nulls. If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <16143.1470350371@sss.pgh.pa.us>	2016-08-07 18:52:02 -04:00
Tom Lane	a3c7a993d5	Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>	2016-08-03 16:37:03 -04:00
Fujii Masao	dd5eb805d5	Remove unused arguments from pg_replication_origin_xact_reset function. The document specifies that pg_replication_origin_xact_reset function doesn't have any argument variables. But previously it was actually defined so as to have two argument variables, though they were not used at all. That is, the pg_proc entry for that function was incorrect. This patch fixes the pg_proc entry and removes those two arguments from the function definition. No back-patch because this change needs a catalog version bump although the issue exists in 9.5 as well. Instead, a note about those unused argument variables will be added to 9.5 document later. Catalog version bumped due to the change of pg_proc.	2016-08-02 02:43:17 +09:00
Tom Lane	99dd8b05aa	Advance PG_CONTROL_VERSION. This should have been done in commit `73c986adde` which added several new fields to pg_control, and again in commit `5028f22f6e` which changed the CRC algorithm, but it wasn't. It's far too late to fix it in the 9.5 branch, but let's do so in 9.6, so that if a 9.6 postmaster is started against a 9.4-era pg_control it will complain about a versioning problem rather than a CRC failure. We already forced initdb/pg_upgrade for beta3, so there's no downside to doing this now. Discussion: <7615.1468598094@sss.pgh.pa.us>	2016-07-16 12:49:14 -04:00
Fujii Masao	60d50769b7	Rename pg_stat_wal_receiver.conn_info to conninfo. Per discussion on pgsql-hackers, conninfo is better as the column name because it's more commonly used in PostgreSQL. Catalog version bumped due to the change of pg_proc. Author: Michael Paquier	2016-07-07 12:59:39 +09:00
Alvaro Herrera	9ed551e0a4	Add conninfo to pg_stat_wal_receiver Commit `b1a9bad9e7` introduced a stats view to provide insight into the running WAL receiver, but neglected to include the connection string in it, as reported by Michaël Paquier. This commit fixes that omission. (Any security-sensitive information is not disclosed). While at it, close the mild security hole that we were exposing the password in the connection string in shared memory. This isn't user-accessible, but it still looks like a good idea to avoid having the cleartext password in memory. Author: Michaël Paquier, Álvaro Herrera Review by: Vik Fearing Discussion: https://www.postgresql.org/message-id/CAB7nPqStg4M561obo7ryZ5G+fUydG4v1Ajs1xZT1ujtu+woRag@mail.gmail.com	2016-06-29 16:57:17 -04:00
Tom Lane	19e972d558	Rethink node-level representation of partial-aggregation modes. The original coding had three separate booleans representing partial aggregation behavior, which was confusing, unreadable, and error-prone, not least because the booleans weren't always listed in the same order. It was also inadequate for the allegedly-desirable future extension to support intermediate partial aggregation, because we'd need separate markers for serialization and deserialization in such a case. Merge these bools into an enum "AggSplit" to provide symbolic names for the supported operating modes (and document what those are). By assigning the values of the enum constants carefully, we can treat AggSplit values as options bitmasks so that tests of what to do aren't noticeably more expensive than before. While at it, get rid of Aggref.aggoutputtype. That's not needed since commit `59a3795c2` got rid of setrefs.c's special-purpose Aggref comparison code, and it likewise seemed more confusing than helpful. Assorted comment cleanup as well (there's still more that I want to do in that line). catversion bump for change in Aggref node contents. Should be the last one for partial-aggregation changes. Discussion: <29309.1466699160@sss.pgh.pa.us>	2016-06-26 14:33:38 -04:00
Tom Lane	f8ace5477e	Fix type-safety problem with parallel aggregate serial/deserialization. The original specification for this called for the deserialization function to have signature "deserialize(serialtype) returns transtype", which is a security violation if transtype is INTERNAL (which it always would be in practice) and serialtype is not (which ditto). The patch blithely overrode the opr_sanity check for that, which was sloppy-enough work in itself, but the indisputable reason this cannot be allowed to stand is that CREATE FUNCTION will reject such a signature and thus it'd be impossible for extensions to create parallelizable aggregates. The minimum fix to make the signature type-safe is to add a second, dummy argument of type INTERNAL. But to lock it down a bit more and make misuse of INTERNAL-accepting functions less likely, let's get rid of the ability to specify a "serialtype" for an aggregate and just say that the only useful serialtype is BYTEA --- which, in practice, is the only interesting value anyway, due to the usefulness of the send/recv infrastructure for this purpose. That means we only have to allow "serialize(internal) returns bytea" and "deserialize(bytea, internal) returns internal" as the signatures for these support functions. In passing fix bogus signature of int4_avg_combine, which I found thanks to adding an opr_sanity check on combinefunc signatures. catversion bump due to removing pg_aggregate.aggserialtype and adjusting signatures of assorted built-in functions. David Rowley and Tom Lane Discussion: <27247.1466185504@sss.pgh.pa.us>	2016-06-22 16:52:41 -04:00
Tom Lane	915b703e16	Fix handling of argument and result datatypes for partial aggregation. When doing partial aggregation, the args list of the upper (combining) Aggref node is replaced by a Var representing the output of the partial aggregation steps, which has either the aggregate's transition data type or a serialized representation of that. However, nodeAgg.c blindly continued to use the args list as an indication of the user-level argument types. This broke resolution of polymorphic transition datatypes at executor startup (though it accidentally failed to fail for the ANYARRAY case, which is likely the only one anyone had tested). Moreover, the constructed FuncExpr passed to the finalfunc contained completely wrong information, which would have led to bogus answers or crashes for any case where the finalfunc examined that information (which is only likely to be with polymorphic aggregates using a non-polymorphic transition type). As an independent bug, apply_partialaggref_adjustment neglected to resolve a polymorphic transition datatype before assigning it as the output type of the lower-level Aggref node. This again accidentally failed to fail for ANYARRAY but would be unlikely to work in other cases. To fix the first problem, record the user-level argument types in a separate OID-list field of Aggref, and look to that rather than the args list when asking what the argument types were. (It turns out to be convenient to include any "direct" arguments in this list too, although those are not currently subject to being overwritten.) Rather than adding yet another resolve_aggregate_transtype() call to fix the second problem, add an aggtranstype field to Aggref, and store the resolved transition type OID there when the planner first computes it. (By doing this in the planner and not the parser, we can allow the aggregate's transition type to change from time to time, although no DDL support yet exists for that.) This saves nothing of consequence for simple non-polymorphic aggregates, but for polymorphic transition types we save a catalog lookup during executor startup as well as several planner lookups that are new in 9.6 due to parallel query planning. In passing, fix an error that was introduced into count_agg_clauses_walker some time ago: it was applying exprTypmod() to something that wasn't an expression node at all, but a TargetEntry. exprTypmod silently returned -1 so that there was not an obvious failure, but this broke the intended sensitivity of aggregate space consumption estimates to the typmod of varchar and similar data types. This part needs to be back-patched. Catversion bump due to change of stored Aggref nodes. Discussion: <8229.1466109074@sss.pgh.pa.us>	2016-06-17 21:44:37 -04:00
Robert Haas	71d05a2c7b	pg_visibility: Add pg_truncate_visibility_map function. This requires some core changes as well so that we can properly WAL-log the truncation. Specifically, it changes the format of the XLOG_SMGR_TRUNCATE WAL record, so bump XLOG_PAGE_MAGIC. Patch by me, reviewed but not fully endorsed by Andres Freund.	2016-06-17 17:37:30 -04:00
Robert Haas	c7a25c242f	Mark some functions parallel-unsafe. currtid() and currtid2() call GetLatestSnapshot(), which fails in parallel mode. pg_export_snapshot() calls ExportSnapshot() which attempts to assign an XID for the current transaction if it does not already have one; that, too, will fail in parallel mode. Andreas Seltenreich	2016-06-15 11:40:07 -04:00
Tom Lane	cae1c788b9	Improve the situation for parallel query versus temp relations. Transmit the leader's temp-namespace state to workers. This is important because without it, the workers do not really have the same search path as the leader. For example, there is no good reason (and no extant code either) to prevent a worker from executing a temp function that the leader created previously; but as things stood it would fail to find the temp function, and then either fail or execute the wrong function entirely. We still prohibit a worker from creating a temp namespace on its own. In effect, a worker can only see the session's temp namespace if the leader had created it before starting the worker, which seems like the right semantics. Also, transmit the leader's BackendId to workers, and arrange for workers to use that when determining the physical file path of a temp relation belonging to their session. While the original intent was to prevent such accesses entirely, there were a number of holes in that, notably in places like dbsize.c which assume they can safely access temp rels of other sessions anyway. We might as well get this right, as a small down payment on someday allowing workers to access the leader's temp tables. (With this change, directly using "MyBackendId" as a relation or buffer backend ID is deprecated; you should use BackendIdForTempRelations() instead. I left a couple of such uses alone though, as they're not going to be reachable in parallel workers until we do something about localbuf.c.) Move the thou-shalt-not-access-thy-leader's-temp-tables prohibition down into localbuf.c, which is where it actually matters, instead of having it in relation_open(). This amounts to recognizing that access to temp tables' catalog entries is perfectly safe in a worker, it's only the data in local buffers that is problematic. Having done all that, we can get rid of the test in has_parallel_hazard() that says that use of a temp table's rowtype is unsafe in parallel workers. That test was unduly expensive, and if we really did need such a prohibition, that was not even close to being a bulletproof guard for it. (For example, any user-defined function executed in a parallel worker might have attempted such access.)	2016-06-09 20:16:11 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Tom Lane	7feb60c1bb	Clarify documentation of ceil/ceiling/floor functions. Document these as "nearest integer >= argument" and "nearest integer <= argument", which will hopefully be less confusing than the old formulation. New wording is from Matlab via Dean Rasheed. I changed the pg_description entries as well as the SGML docs. In the back branches, this will only affect installations initdb'd in the future, but it should be harmless otherwise. Discussion: <CAEZATCW3yzJo-NMSiQs5jXNFbTsCEftZS-Og8=FvFdiU+kYuSA@mail.gmail.com>	2016-06-09 11:58:00 -04:00
Tom Lane	16ea51a263	Pin the built-in index access methods. This was overlooked in commit `473b93287`, which introduced DROP ACCESS METHOD. Although that command is restricted to superusers, we don't want even superusers dropping the built-in methods; "DROP ACCESS METHOD btree" in particular is unrecoverable from. Pin these objects in the same way that other initdb-created objects are pinned. I chose to bump catversion for this fix. That's not absolutely necessary perhaps, but it will ensure that no 9.6 production systems are missing the pin entries.	2016-05-19 14:40:02 -04:00
Tom Lane	1a2c17f8e2	Fix pg_upgrade to not fail when new-cluster TOAST rules differ from old. This patch essentially reverts commit `4c6780fd17`, in favor of a much simpler solution for the case where the new cluster would choose to create a TOAST table but the old cluster doesn't have one: just don't create a TOAST table. The existing code failed in at least two different ways if the situation arose: (1) ALTER TABLE RESET didn't grab an exclusive lock, so that the lock sanity check in create_toast_table failed; (2) pg_upgrade did not provide a pg_type OID for the new toast table, so that the crosscheck in TypeCreate failed. While both these problems were introduced by later patches, they show that the hack being used to cause TOAST table creation is overwhelmingly fragile (and untested). I also note that before the TypeCreate crosscheck was added, the code would have resulted in assigning an indeterminate pg_type OID to the toast table, possibly causing a later OID conflict in that catalog; so that it didn't really work even when committed. If we simply don't create a TOAST table, there will only be a problem if the code tries to store a tuple that's wider than a page, and field compression isn't sufficient to get it under a page. Given that the TOAST creation threshold is intended to be about a quarter of a page, it's very hard to believe that cross-version differences in the do-we-need-a-toast-table heuristic could result in an observable problem. So let's just follow the old version's conclusion about whether a TOAST table is needed. (If we ever do change needs_toast_table() so much that this conclusion doesn't apply, we can devise a solution at that time, and hopefully do it in a less klugy way than `4c6780fd17` did.) Back-patch to 9.3, like the previous patch. Discussion: <8110.1462291671@sss.pgh.pa.us>	2016-05-06 22:05:56 -04:00
Tom Lane	0b9a234432	Rename tsvector delete() to ts_delete(), and filter() to ts_filter(). The similarity of the original names to SQL keywords seems like a bad idea. Rename them before we're stuck with 'em forever. In passing, minor code and docs cleanup. Discussion: <4875.1462210058@sss.pgh.pa.us>	2016-05-05 19:43:32 -04:00
Robert Haas	9888b34fdb	Fix more things to be parallel-safe. Conversion functions were previously marked as parallel-unsafe, since that is the default, but in fact they are safe. Parallel-safe functions defined in pg_proc.h and redefined in system_views.sql were ending up as parallel-unsafe because the redeclarations were not marked PARALLEL SAFE. While editing system_views.sql, mark ts_debug() parallel safe also. Andreas Karlsson	2016-05-03 14:36:38 -04:00
Robert Haas	37d0c2cb1a	Fix parallel safety markings for pg_start_backup. Commit `7117685461` made pg_start_backup parallel-restricted rather than parallel-safe, because it now relies on backend-private state that won't be synchronized with the parallel worker. However, it didn't update pg_proc.h. Separately, Andreas Karlsson observed that system_views.sql neglected to reiterate the parallel-safety markings when redefining various functions, including this one; so add a PARALLEL RESTRICTED declaration there to match the new value in pg_proc.h.	2016-05-02 10:42:34 -04:00
Stephen Frost	7a542700df	Create default roles This creates an initial set of default roles which administrators may use to grant access to, historically, superuser-only functions. Using these roles instead of granting superuser access reduces the number of superuser roles required for a system. Documentation for each of the default roles has been added to user-manag.sgml. Bump catversion to 201604082, as we had a commit that bumped it to 201604081 and another that set it back to 201604071... Reviews by José Luis Tallón and Robert Haas	2016-04-08 16:56:27 -04:00
Teodor Sigaev	8b99edefca	Revert CREATE INDEX ... INCLUDING ... It's not ready yet. Revert two commits: `690c543550` (unstable test output) and `386e3d7609` (the patch itself).	2016-04-08 21:52:13 +03:00
Robert Haas	af025eed53	Add combine functions for various floating-point aggregates. This allows parallel aggregation to use them. It may seem surprising that we use float8_combine for both float4_accum and float8_accum transition functions, but that's because those functions differ only in the type of the non-transition-state argument. Haribabu Kommi, reviewed by David Rowley and Tomas Vondra	2016-04-08 13:47:06 -04:00
Teodor Sigaev	386e3d7609	CREATE INDEX ... INCLUDING (column[, ...]) Indexes (only B-tree for now) can now contain "extra" column(s) which don't participate in the index structure; they are just stored in leaf tuples. This allows an index-only scan to use a single index instead of two or more indexes. Author: Anastasia Lubennikova with minor editorializing by me Reviewers: David Rowley, Peter Geoghegan, Jeff Janes	2016-04-08 19:45:59 +03:00
Teodor Sigaev	bb140506df	Phrase full text search. This patch introduces a new text search operator (<-> or <DISTANCE>) into tsquery. The on-disk and binary in/out formats of tsquery are backward compatible. It has two side effects: the sort order of tsquery changes, so users who have a btree index over tsquery should reindex it; and tsquery output contains fewer parentheses, which makes it more readable. Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov Reviewers: Alexander Korotkov, Artur Zakirov	2016-04-07 18:44:18 +03:00
Stephen Frost	29dd1504a1	Bump catversion for pg_dump dump catalog ACL patches Pointed out by Tom.	2016-04-06 23:04:48 -04:00
Stephen Frost	23f34fa4ba	In pg_dump, include pg_catalog and extension ACLs, if changed Now that all of the infrastructure exists, add in the ability to dump out the ACLs of the objects inside of pg_catalog or the ACLs for objects which are members of extensions, but only if they have been changed from their original values. The original values are tracked in pg_init_privs. When pg_dump'ing 9.6-and-above databases, we will dump out the ACLs for all objects in pg_catalog and the ACLs for all extension members, where the ACL has been changed from the original value which was set during either initdb or CREATE EXTENSION. This should not change dumps against pre-9.6 databases. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Stephen Frost	6c268df127	Add new catalog called pg_init_privs This new catalog holds the privileges which the system was initialized with at initdb time, along with any permissions set by extensions at CREATE EXTENSION time. This allows pg_dump (and any other similar use-cases) to detect when the privileges set on initdb-created or extension-created objects have been changed from what they were set to at initdb/extension-creation time and handle those changes appropriately. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Teodor Sigaev	0b62fd036e	Add jsonb_insert. It inserts a new value into a jsonb array at an arbitrary position, or a new key into a jsonb object. Author: Dmitry Dolgov Reviewers: Petr Jelinek, Vitaly Burovoy, Andrew Dunstan	2016-04-06 19:25:00 +03:00
Simon Riggs	3fe3511d05	Generic Messages for Logical Decoding API and mechanism to allow generic messages to be inserted into WAL that are intended to be read by logical decoding plugins. This commit adds an optional new callback to the logical decoding API. Messages are either text or bytea. Messages can be transactional, or not, and are identified by a prefix to allow multiple concurrent decoding plugins. (Not to be confused with Generic WAL records, which are intended to allow crash recovery of extensible objects.) Author: Petr Jelinek and Andres Freund Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs Discussion: 5685F999.6010202@2ndquadrant.com	2016-04-06 10:05:41 +01:00
Alvaro Herrera	f2fcad27d5	Support ALTER THING .. DEPENDS ON EXTENSION This introduces a new dependency type which marks an object as depending on an extension, such that if the extension is dropped, the object automatically goes away; and also, if the database is dumped, the object is included in the dump output. Currently the grammar supports this for indexes, triggers, materialized views and functions only, although the utility code is generic so adding support for more object types is a matter of touching the parser rules only. Author: Abhijit Menon-Sen Reviewed-by: Alexander Korotkov, Álvaro Herrera Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org	2016-04-05 18:38:54 -03:00
Robert Haas	41ea0c2376	Fix parallel-safety code for parallel aggregation. has_parallel_hazard() was ignoring the proparallel markings for aggregates, which is no good. Fix that. There was no way to mark an aggregate as actually being parallel-safe, either, so add a PARALLEL option to CREATE AGGREGATE. Patch by me, reviewed by David Rowley.	2016-04-05 16:06:15 -04:00
Robert Haas	11c8669c0c	Add parallel query support functions for assorted aggregates. This lets us use parallel aggregate for a variety of useful cases that didn't work before, like sum(int8), sum(numeric), several versions of avg(), and various other functions. Add some regression tests, as well, testing the general sanity of these and future catalog entries. David Rowley, reviewed by Tomas Vondra, with a few further changes by me.	2016-04-05 14:32:53 -04:00
Magnus Hagander	7117685461	Implement backup API functions for non-exclusive backups Previously non-exclusive backups had to be done using the replication protocol and pg_basebackup. With this commit it's now possible to make them using pg_start_backup/pg_stop_backup as well, as long as the backup program can maintain a persistent connection to the database. Doing this, backup_label and tablespace_map are returned as results from pg_stop_backup() instead of being written to the data directory. This makes the server safe from a crash during an ongoing backup, which can be a problem with exclusive backups. The old syntax of the functions remains and works exactly as before, but since the new syntax is safer the old one should eventually be deprecated and removed. Only reference documentation is included. The main section on backup still needs to be rewritten to cover this, but since that is already scheduled for a separate large rewrite, it's not included in this patch. Reviewed by David Steele and Amit Kapila	2016-04-05 20:03:49 +02:00
Teodor Sigaev	2d02a856e8	Bump catalog version, forgotten in `acdf2a8b37`	2016-03-30 18:56:21 +03:00
Teodor Sigaev	acdf2a8b37	Introduce SP-GiST operator class over box. This patch implements a quad tree over boxes. A naive 2D quad tree does not work for non-point objects because splitting space at a node is not efficient. The idea of the patch is to treat 2D boxes as 4D points, so that objects do not overlap (in 4D space). Performance tests show that this technique is especially beneficial with heavily overlapping objects, so-called "spaghetti data". Author: Alexander Lebedev with editing by Emre Hasegeli and me	2016-03-30 18:42:36 +03:00
Tom Lane	e511d878f3	Allow to_timestamp(float8) to convert float infinity to timestamp infinity. With the original SQL-function implementation, such cases failed because we don't support infinite intervals. Converting the function to C lets us bypass the interval representation, which should be a bit faster as well as more flexible. Vitaly Burovoy, reviewed by Anastasia Lubennikova	2016-03-29 17:09:29 -04:00
Robert Haas	5fe5a2cee9	Allow aggregate transition states to be serialized and deserialized. This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me.	2016-03-29 15:04:05 -04:00
Tom Lane	c94959d411	Fix DROP OPERATOR to reset oprcom/oprnegate links to the dropped operator. This avoids leaving dangling links in pg_operator; which while fairly harmless are also unsightly. While we're at it, simplify OperatorUpd, which went through heap_modify_tuple for no very good reason considering it had already made a tuple copy it could just scribble on. Roma Sokolov, reviewed by Tomas Vondra, additional hacking by Robert Haas and myself.	2016-03-25 12:33:16 -04:00
Alvaro Herrera	473b932870	Support CREATE ACCESS METHOD This enables external code to create access methods. This is useful so that extensions can add their own access methods which can be formally tracked for dependencies, so that DROP operates correctly. Also, having explicit support makes pg_dump work correctly. Currently only index AMs are supported, but we expect different types to be added in the future. Authors: Alexander Korotkov, Petr Jelínek Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby Commitfest-URL: https://commitfest.postgresql.org/9/353/ Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com	2016-03-23 23:01:35 -03:00
Robert Haas	e06a38965b	Support parallel aggregation. Parallel workers can now partially aggregate the data and pass the transition values back to the leader, which can combine the partial results to produce the final answer. David Rowley, based on earlier work by Haribabu Kommi. Reviewed by Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.	2016-03-21 09:30:18 -04:00
Peter Eisentraut	b555ed8102	Merge wal_level "archive" and "hot_standby" into new name "replica" The distinction between "archive" and "hot_standby" existed only because at the time "hot_standby" was added, there was some uncertainty about stability. This is now a long time ago. We would like to move forward with simplifying the replication configuration, but this distinction is in the way, because a primary server cannot tell (without asking a standby or predicting the future) which one of these would be the appropriate level. Pick a new name for the combined setting to make it clearer that it covers all (non-logical) backup and replication uses. The old values are still accepted but are converted internally. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: David Steele <david@pgmasters.net>	2016-03-18 23:56:03 +01:00
Teodor Sigaev	3187d6de0e	Introduce parse_ident() SQL-layer function to split qualified identifier into array parts. Author: Pavel Stehule with minor editorization by me and Jim Nasby	2016-03-18 18:16:14 +03:00
Robert Haas	c16dc1aca5	Add simple VACUUM progress reporting. There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing. Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.	2016-03-15 13:32:56 -04:00
Tom Lane	2da7549987	pg_stat_get_progress_info() should be marked STRICT. I didn't bother with a catversion bump. Report and patch by Thomas Munro	2016-03-14 12:51:55 -04:00
Teodor Sigaev	a9eb6c83ef	Bump catalog version missed in `6943a946c7`	2016-03-11 19:31:04 +03:00
Teodor Sigaev	6943a946c7	Tsvector editing functions Adds several tsvector editing functions: convert tsvector to/from text array, set weight for given lexemes, delete lexeme(s), unnest, filter lexemes with given weights Author: Stas Kelvich with some editorization by me Reviewers: Tomas Vondram, Teodor Sigaev	2016-03-11 19:22:36 +03:00
Robert Haas	53be0b1add	Provide much better wait information in pg_stat_activity. When a process is waiting for a heavyweight lock, we will now indicate the type of heavyweight lock for which it is waiting. Also, you can now see when a process is waiting for a lightweight lock - in which case we will indicate the individual lock name or the tranche, as appropriate - or for a buffer pin. Amit Kapila, Ildus Kurbangaliev, reviewed by me. Lots of helpful discussion and suggestions by many others, including Alexander Korotkov, Vladimir Borodin, and many others.	2016-03-10 12:44:09 -05:00
Robert Haas	b6fb6471f6	Add a generic command progress reporting facility. Using this facility, any utility command can report the target relation upon which it is operating, if there is one, and up to 10 64-bit counters; the intent of this is that users should be able to figure out what a utility command is doing without having to resort to ugly hacks like attaching strace to a backend. As a demonstration, this adds very crude reporting to lazy vacuum; we just report the target relation and nothing else. A forthcoming patch will make VACUUM report a bunch of additional data that will make this much more interesting. But this gets the basic framework in place. Vinayak Pokale, Rahila Syed, Amit Langote, Robert Haas, reviewed by Kyotaro Horiguchi, Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao, and Masanori Oyama.	2016-03-09 12:08:58 -05:00
Joe Conway	dc7d70ea05	Expose control file data via SQL accessible functions. Add four new SQL accessible functions: pg_control_system(), pg_control_checkpoint(), pg_control_recovery(), and pg_control_init() which expose a subset of the control file data. Along the way move the code to read and validate the control file to src/common, where it can be shared by the new backend functions and the original pg_controldata frontend program. Patch by me, significant input, testing, and review by Michael Paquier.	2016-03-05 11:10:19 -08:00
Tom Lane	eb43e851d6	Create stub functions to support pg_upgrade of old contrib/tsearch2. Commits `9ff60273e3` and `dbe2328959` adjusted the declarations of some core functions referenced by contrib/tsearch2's install script, forgetting that in a pg_upgrade situation, we'll be trying to restore operator class definitions that reference the old signatures. We've hit this problem before; solve it in the same way as before, namely by installing stub functions that have the expected signature and just invoke the correct function. Per report from Jeff Janes. (Someday we ought to stop supporting contrib/tsearch2, but I'm not sure today is that day.)	2016-03-02 17:37:54 -05:00
Robert Haas	a892234f83	Change the format of the VM fork to add a second bit per page. The new bit indicates whether every tuple on the page is already frozen. It is cleared only when the all-visible bit is cleared, and it can be set only when we vacuum a page and find that every tuple on that page is both visible to every transaction and in no need of any future vacuuming. A future commit will use this new bit to optimize away full-table scans that would otherwise be triggered by XID wraparound considerations. A page which is merely all-visible must still be scanned in that case, but a page which is all-frozen need not be. This commit does not attempt that optimization, although that optimization is the goal here. It seems better to get the basic infrastructure in place first. Per discussion, it's very desirable for pg_upgrade to automatically migrate existing VM forks from the old format to the new format. That, too, will be handled in a follow-on patch. Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit Kapila, Simon Riggs, Andres Freund, and others, and substantially revised by me.	2016-03-01 21:49:41 -05:00
Tom Lane	52f5d578d6	Create a function to reliably identify which sessions block which others. This patch introduces "pg_blocking_pids(int) returns int[]", which returns the PIDs of any sessions that are blocking the session with the given PID. Historically people have obtained such information using a self-join on the pg_locks view, but it's unreasonably tedious to do it that way with any modicum of correctness, and the addition of parallel queries has pretty much broken that approach altogether. (Given some more columns in the view than there are today, you could imagine handling parallel-query cases with a 4-way join; but ugh.) The new function has the following behaviors that are painful or impossible to get right via pg_locks: 1. Correctly understands which lock modes block which other ones. 2. In soft-block situations (two processes both waiting for conflicting lock modes), only the one that's in front in the wait queue is reported to block the other. 3. In parallel-query cases, reports all sessions blocking any member of the given PID's lock group, and reports a session by naming its leader process's PID, which will be the pg_backend_pid() value visible to clients. The motivation for doing this right now is mostly to fix the isolation tests. Commit `38f8bdcac4` lobotomized isolationtester's is-it-waiting query by removing its ability to recognize nonconflicting lock modes, as a crude workaround for the inability to handle soft-block situations properly. But even without the lock mode tests, the old query was excessively slow, particularly in CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new deadlock-hard test because the deadlock timeout elapses before they can probe the waiting status of all eight sessions. Replacing the pg_locks self-join with use of pg_blocking_pids() is not only much more correct, but a lot faster: I measure it at about 9X faster in a typical dev build with Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the test, without having to lengthen deadlock_timeout yet more and thus slow down the test for everyone else.	2016-02-22 14:31:43 -05:00
Dean Rasheed	53874c5228	Add pg_size_bytes() to parse human-readable size strings. This will parse strings in the format produced by pg_size_pretty() and return sizes in bytes. This allows queries to be written with clauses like "pg_total_relation_size(oid) > pg_size_bytes('10 GB')". Author: Pavel Stehule with various improvements by Vitaly Burovoy Discussion: http://www.postgresql.org/message-id/CAFj8pRD-tGoDKnxdYgECzA4On01_uRqPrwF-8LdkSE-6bDHp0w@mail.gmail.com Reviewed-by: Vitaly Burovoy, Oleksandr Shulgin, Kyotaro Horiguchi, Michael Paquier and Robert Haas	2016-02-20 09:57:27 +00:00
Joe Conway	a5c43b8869	Add new system view, pg_config Move and refactor the underlying code for the pg_config client application to src/common in support of sharing it with a new system information SRF called pg_config() which makes the same information available via SQL. Additionally wrap the SRF with a new system view, also called pg_config. Patch by me with extensive input and review by Michael Paquier and additional review by Alvaro Herrera.	2016-02-17 09:12:06 -08:00
Joe Conway	851636bfda	Move DATA entry to correct position In commit `7b4bfc87` the DATA and DESCR entries for the new row_security_active() function were inadvertently put after the PROVOLATILE defines, rather than before as they should have been placed. Move them up where they belong. Backpatch to 9.5 where the new entries were introduced.	2016-02-15 16:38:47 -08:00
Tom Lane	d4c3a156cb	Remove GROUP BY columns that are functionally dependent on other columns. If a GROUP BY clause includes all columns of a non-deferred primary key, as well as other columns of the same relation, those other columns are redundant and can be dropped from the grouping; the pkey is enough to ensure that each row of the table corresponds to a separate group. Getting rid of the excess columns will reduce the cost of the sorting or hashing needed to implement GROUP BY, and can indeed remove the need for a sort step altogether. This seems worth testing for since many query authors are not aware of the GROUP-BY-primary-key exception to the rule about queries not being allowed to reference non-grouped-by columns in their targetlists or HAVING clauses. Thus, redundant GROUP BY items are not uncommon. Also, we can make the test pretty cheap in most queries where it won't help by not looking up a rel's primary key until we've found that at least two of its columns are in GROUP BY. David Rowley, reviewed by Julien Rouhaud	2016-02-11 17:34:59 -05:00
Tom Lane	72eee410d4	Move pg_constraint.h function declarations to new file pg_constraint_fn.h. A pending patch requires exporting a function returning Bitmapset from catalog/pg_constraint.c. As things stand, that would mean including nodes/bitmapset.h in pg_constraint.h, which might be hazardous for the client-side includability of that header. It's not entirely clear whether any client-side code needs to include pg_constraint.h, but it seems prudent to assume that there is some such code somewhere. Therefore, split off the function definitions into a new file pg_constraint_fn.h, similarly to what we've done for some other catalog header files.	2016-02-11 15:51:28 -05:00
Robert Haas	d89f06f048	Fix parallel-safety markings for pg_upgrade functions. These establish backend-local state which will not be copied to parallel workers, so they must be marked parallel-restricted, not parallel-safe.	2016-02-07 11:45:21 -05:00
Tom Lane	6819514fca	Add num_nulls() and num_nonnulls() to count NULL arguments. An example use-case is "CHECK(num_nonnulls(a,b,c) = 1)" to assert that exactly one of a,b,c isn't NULL. The functions are variadic, so they can also be pressed into service to count the number of null or nonnull elements in an array. Marko Tiikkaja, reviewed by Pavel Stehule	2016-02-04 23:03:37 -05:00
Robert Haas	b47b4dbf68	Extend sortsupport for text to more opclasses. Have varlena.c expose an interface that allows the char(n), bytea, and bpchar types to piggyback on a now-generalized SortSupport for text. This pushes a little more knowledge of the bpchar/char(n) type into varlena.c than might be preferred, but that seems like the approach that creates least friction. Also speed things up for index builds that use text_pattern_ops or varchar_pattern_ops. This patch does quite a bit of renaming, but it seems likely to be worth it, so as to avoid future confusion about the fact that this code is now more generally used than the old names might have suggested. Peter Geoghegan, reviewed by Álvaro Herrera and Andreas Karlsson, with small tweaks by me.	2016-02-03 14:29:53 -05:00
Tom Lane	2ad83fff22	Remove unnecessary "implementation of FOO operator" DESCR() entries. Apparently at least one committer hasn't gotten the word that these do not need to be maintained by hand, since initdb will create them automatically. Noted while fixing bug #13905. No catversion bump since the post-initdb state is exactly the same either way. I don't see a need for back-patch, either.	2016-02-02 11:52:27 -05:00
Tom Lane	a4627e8fd4	Fix pg_description entries for jsonb_to_record() and jsonb_to_recordset(). All the other jsonb function descriptions refer to the arguments as being "jsonb", but these two said "json". Make it consistent. Per bug #13905 from Petru Florin Mihancea. No catversion bump --- we can't force one in the back branches, and this isn't very critical anyway.	2016-02-02 11:39:50 -05:00
Fujii Masao	7f46eaf035	Add gin_clean_pending_list function to clean up GIN pending list This function cleans up the pending list of the GIN index by moving entries in it to the main GIN data structure in bulk. It returns the number of pages cleaned up from the pending list. This function is useful, for example, when the pending list needs to be cleaned up quickly to improve the performance of the search using GIN index. VACUUM can do the same thing, too, but it may take days to run on a large table. Jeff Janes, reviewed by Julien Rouhaud, Jaime Casanova, Alvaro Herrera and me. Discussion: CAMkU=1x8zFkpfnozXyt40zmR3Ub_kHu58LtRmwHUKRgQss7=iQ@mail.gmail.com	2016-01-28 12:57:52 +09:00
Fujii Masao	e09507a272	Fix volatility marking of pg_size_pretty function pg_size_pretty function should be marked immutable rather than volatile because it always returns the same result given the same argument. Pavel Stehule	2016-01-27 11:13:31 +09:00
Tom Lane	e1bd684a34	Add trigonometric functions that work in degrees. The implementations go to some lengths to deliver exact results for values where an exact result can be expected, such as sind(30) = 0.5 exactly. Dean Rasheed, reviewed by Michael Paquier	2016-01-22 15:46:22 -05:00
Robert Haas	a7de3dc5c3	Support multi-stage aggregation. Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input. These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c. David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me.	2016-01-20 13:46:50 -05:00
Tom Lane	dbe2328959	Fix assorted inconsistencies in GIN opclass support function declarations. GIN had some minor issues too, mostly using "internal" where something else would be more appropriate. I went with the same approach as in `9ff60273e3`, namely preferring the opclass' indexed datatype for arguments that receive an operator RHS value, even if that's not necessarily what they really are. Again, this is with an eye to having a uniform rule for ginvalidate() to check support function signatures.	2016-01-19 22:32:22 -05:00
Tom Lane	9ff60273e3	Fix assorted inconsistencies in GiST opclass support function declarations. The conventions specified by the GiST SGML documentation were widely ignored. For example, the strategy-number argument for "consistent" and "distance" functions is specified to be a smallint, but most of the built-in support functions declared it as an integer, and for that matter the core code passed it using Int32GetDatum not Int16GetDatum. None of that makes any real difference at runtime, but it's quite confusing for newcomers to the code, and it makes it very hard to write an amvalidate() function that checks support function signatures. So let's try to instill some consistency here. Another similar issue is that the "query" argument is not of a single well-defined type, but could have different types depending on the strategy (corresponding to search operators with different righthand-side argument types). Some of the functions threw up their hands and declared the query argument as being of "internal" type, which surely isn't right ("any" would have been more appropriate); but the majority position seemed to be to declare it as being of the indexed data type, corresponding to a search operator with both input types the same. So I've specified a convention that that's what to do always. Also, the result of the "union" support function actually must be of the index's storage type, but the documentation suggested declaring it to return "internal", and some of the functions followed that. Standardize on telling the truth, instead. Similarly, standardize on declaring the "same" function's inputs as being of the storage type, not "internal". Also, somebody had forgotten to add the "recheck" argument to both the documentation of the "distance" support function and all of their SQL declarations, even though the C code was happily using that argument. Clean that up too. Fix up some other omissions in the docs too, such as documenting that union's second input argument is vestigial. So far as the errors in core function declarations go, we can just fix pg_proc.h and bump catversion. Adjusting the erroneous declarations in contrib modules is more debatable: in principle any change in those scripts should involve an extension version bump, which is a pain. However, since these changes are purely cosmetic and make no functional difference, I think we can get away without doing that.	2016-01-19 12:04:36 -05:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
Simon Riggs	e63bb4549a	Add new user fn pg_current_xlog_flush_location() Tomas Vondra, reviewed by Michael Paquier and Amit Kapila Minor edits by me	2016-01-12 07:54:52 +00:00
Tom Lane	26d538dc93	Clean up some lack-of-STRICT issues in the core code, too. A scan for missed proisstrict markings in the core code turned up these functions: brin_summarize_new_values pg_stat_reset_single_table_counters pg_stat_reset_single_function_counters pg_create_logical_replication_slot pg_create_physical_replication_slot pg_drop_replication_slot The first three of these take OID, so a null argument will normally look like a zero to them, resulting in "ERROR: could not open relation with OID 0" for brin_summarize_new_values, and no action for the pg_stat_reset_XXX functions. The other three will dump core on a null argument, though this is mitigated by the fact that they won't do so until after checking that the caller is superuser or has rolreplication privilege. In addition, the pg_logical_slot_get/peek[_binary]_changes family was intentionally marked nonstrict, but failed to make nullness checks on all the arguments; so again a null-pointer-dereference crash is possible but only for superusers and rolreplication users. Add the missing ARGISNULL checks to the latter functions, and mark the former functions as strict in pg_proc. Make that change in the back branches too, even though we can't force initdb there, just so that installations initdb'd in future won't have the issue. Since none of these bugs rise to the level of security issues (and indeed the pg_stat_reset_XXX functions hardly misbehave at all), it seems sufficient to do this. In addition, fix some order-of-operations oddities in the slot_get_changes family, mostly cosmetic, but not the part that moves the function's last few operations into the PG_TRY block. As it stood, there was significant risk for an error to exit without clearing historical information from the system caches. The slot_get_changes bugs go back to 9.4 where that code was introduced. Back-patch appropriate subsets of the pg_proc changes into all active branches, as well.	2016-01-09 16:58:32 -05:00
Alvaro Herrera	b1a9bad9e7	pgstat: add WAL receiver status view & SRF This new view provides insight into the state of a running WAL receiver in a HOT standby node. The information returned includes the PID of the WAL receiver process, its status (stopped, starting, streaming, etc), start LSN and TLI, last received LSN and TLI, timestamp of last message send and receipt, latest end-of-WAL LSN and time, and the name of the slot (if any). Access to the detailed data is only granted to superusers; others only get the PID. Author: Michael Paquier Reviewer: Haribabu Kommi	2016-01-07 16:21:19 -03:00
Alvaro Herrera	abb1733922	Add scale(numeric) Author: Marko Tiikkaja	2016-01-05 19:02:13 -03:00
Tom Lane	ea0d494dae	Make the to_reg*() functions accept text not cstring. Using cstring as the input type was a poor decision, because that's not really a full-fledged type. In particular, it lacks implicit coercions from text or varchar, meaning that usages like to_regproc('foo'||'bar') wouldn't work; basically the only case that did work without explicit casting was a simple literal constant argument. The lack of field complaints about this suggests that hardly anyone is using these functions, so hopefully fixing it won't cause much of a compatibility problem. They've only been there since 9.4, anyway. Petr Korobeinikov	2016-01-05 13:02:43 -05:00
Alvaro Herrera	efa318bcfa	Make pg_shseclabel available in early backend startup While the in-core authentication mechanism doesn't need to access pg_shseclabel at all, it's reasonable to think that an authentication hook will want to look at the label for the role logging in, or for rows in other catalogs used during the authentication phase of startup. Catalog version bumped, because this changes the "is nailed" status for pg_shseclabel. Author: Adam Brightwell	2016-01-05 14:50:53 -03:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Tom Lane	0dab5ef39b	Fix ALTER OPERATOR to update dependencies properly. Fix an oversight in commit 321eed5f0f7563a0: replacing an operator's selectivity functions needs to result in a corresponding update in pg_depend. We have a function that can handle that, but it was not called by AlterOperator(). To fix this without enlarging pg_operator.h's #include list beyond what clients can safely include, split off the function definitions into a new file pg_operator_fn.h, similarly to what we've done for some other catalog header files. It's not entirely clear whether any client-side code needs to include pg_operator.h, but it seems prudent to assume that there is some such code somewhere.	2015-12-31 17:37:31 -05:00
Joe Conway	241448b23a	Rename (new|old)estCommitTs to (new|old)estCommitTsXid The variables newestCommitTs and oldestCommitTs sound as if they are timestamps, but in fact they are the transaction Ids that correspond to the newest and oldest timestamps rather than the actual timestamps. Rename these variables to reflect that they are actually xids: to wit newestCommitTsXid and oldestCommitTsXid respectively. Also modify related code in a similar fashion, particularly the user facing output emitted by pg_controldata and pg_resetxlog. Complaint and patch by me, review by Tom Lane and Alvaro Herrera. Backpatch to 9.5 where these variables were first introduced.	2015-12-28 12:34:11 -08:00
Tom Lane	074c5cfbfb	Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit `5ed6546cf`, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit `5f173040`. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.	2015-11-20 14:55:47 -05:00
Robert Haas	bc4996e61b	Make ALTER .. SET SCHEMA do nothing, instead of throwing an ERROR. This was already true for CREATE EXTENSION, but historically has not been true for other object types. Therefore, this is a backward incompatibility. Per discussion on pgsql-hackers, everyone seems to agree that the new behavior is better. Marti Raudsepp, reviewed by Haribabu Kommi and myself	2015-11-19 10:49:25 -05:00
Tom Lane	c5e86ea932	Add "xid <> xid" and "xid <> int4" operators. The corresponding "=" operators have been there a long time, and not having their negators is a bit of a nuisance. Michael Paquier	2015-11-07 16:40:15 -05:00
Robert Haas	a76ef15d9f	Add sort support routine for the UUID data type. This introduces a simple encoding scheme to produce abbreviated keys: pack as many bytes of each UUID as will fit into a Datum. On little-endian machines, a byteswap is also performed; the abbreviated comparator can therefore just consist of a simple 3-way unsigned integer comparison. The purpose of this change is to speed up sorting data on a column of type UUID. Peter Geoghegan	2015-11-06 12:14:35 -05:00
Robert Haas	816e336f12	Mark more functions parallel-restricted or parallel-unsafe. Commit `7aea8e4f2d` was overoptimistic about the degree of safety associated with running various functions in parallel mode. Functions that take a table name or OID as an argument are at least parallel-restricted, because the table might be temporary, and we currently don't allow parallel workers to touch temporary tables. Functions that take a query as an argument are outright unsafe, because the query could be anything, including a parallel-unsafe query. Also, the queue of pending notifications is backend-private, so adding to it from a worker doesn't behave correctly. We could fix this by transferring the worker's queue of pending notifications to the master during worker cleanup, but that seems like more trouble than it's worth for now. In addition to adjusting the pg_proc.h markings, also add an explicit check for this in async.c.	2015-10-16 11:49:31 -04:00
Bruce Momjian	b852dc4cbd	docs: clarify JSONB operator descriptions No catalog bump as the catalog changes are for SQL operator comments. Backpatch through 9.5	2015-10-07 09:06:49 -04:00
Stephen Frost	4158cc3793	Do not write out WCOs in Query The WithCheckOptions list in Query are only populated during rewrite and do not need to be written out or read in as part of a Query structure. Further, move WithCheckOptions to the bottom and add comments to clarify that it is only populated during rewrite. Back-patch to 9.5 with a catversion bump, as we are still in alpha.	2015-10-05 07:38:58 -04:00
Stephen Frost	088c83363a	ALTER TABLE .. FORCE ROW LEVEL SECURITY To allow users to force RLS to always be applied, even for table owners, add ALTER TABLE .. FORCE ROW LEVEL SECURITY. row_security=off overrides FORCE ROW LEVEL SECURITY, to ensure pg_dump output is complete (by default). Also add SECURITY_NOFORCE_RLS context to avoid data corruption when ALTER TABLE .. FORCE ROW LEVEL SECURITY is being used. The SECURITY_NOFORCE_RLS security context is used only during referential integrity checks and is only considered in check_enable_rls() after we have already checked that the current user is the owner of the relation (which should always be the case during referential integrity checks). Back-patch to 9.5 where RLS was added.	2015-10-04 21:05:08 -04:00
Noah Misch	3cb0a7e75a	Make BYPASSRLS behave like superuser RLS bypass. Specifically, make its effect independent from the row_security GUC, and make it affect permission checks pertinent to views the BYPASSRLS role owns. The row_security GUC thereby ceases to change successful-query behavior; it can only make a query fail with an error. Back-patch to 9.5, where BYPASSRLS was introduced.	2015-10-03 20:19:57 -04:00
Robert Haas	7aea8e4f2d	Determine whether it's safe to attempt a parallel plan for a query. Commit `924bcf4f16` introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under parallelism are resolved. Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be executed to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK. This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future. Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.	2015-09-16 15:38:47 -04:00
Andres Freund	6fcd88511f	Allow pg_create_physical_replication_slot() to reserve WAL. When creating a physical slot it's often useful to immediately reserve the current WAL position instead of only doing so after the first feedback message arrives. That e.g. allows slots to guarantee that all the WAL for a base backup will be available afterwards. Logical slots already have to reserve WAL during creation, so generalize that logic into being usable for both physical and logical slots. Catversion bump because of the new parameter. Author: Gurjeet Singh Reviewed-By: Andres Freund Discussion: CABwTF4Wh_dBCzTU=49pFXR6coR4NW1ynb+vBqT+Po=7fuq5iCw@mail.gmail.com	2015-08-11 12:34:31 +02:00
Andres Freund	3f811c2d6f	Add confirmed_flush column to pg_replication_slots. There's no reason not to expose both restart_lsn and confirmed_flush since they have rather distinct meanings. The former is the oldest WAL still required and valid for both physical and logical slots, whereas the latter is the location up to which a logical slot's consumer has confirmed receiving data. Most of the time a slot will require older WAL (i.e. restart_lsn) than the confirmed position (i.e. confirmed_flush_lsn). Author: Marko Tiikkaja, editorialized by me Discussion: 559D110B.1020109@joh.to	2015-08-10 13:28:18 +02:00
Andres Freund	4eda0a6470	Don't include low level locking code from frontend code. Some frontend code, e.g. pg_xlogdump or pg_resetxlog, has to use backend headers. Unfortunately until now that code includes most of the locking code. It's generally not nice to expose such low level details, but `de6fd1c898` made that a hard problem. We fall back to defining 'inline' away if the compiler doesn't support it - that can cause linker errors like on buildfarm animal pademelon if an inline function references backend only code. To fix that problem separate definitions from lock.h that are required from frontend code into lockdefs.h and use it in the relevant places. I've only removed the minimal amount of necessary definitions for now - it might turn out that we want more for other reasons. To avoid such details being exposed again put some checks against being included from frontend code into atomics.h, lock.h, lwlock.h and s_lock.h. It's otherwise fairly easy to indirectly include these headers. Discussion: 20150806070902.GE12214@awork2.anarazel.de	2015-08-07 15:10:56 +02:00
Noah Misch	b8fe12a836	Reconcile nodes/*funcs.c with recent work. A few of the discrepancies had semantic significance, but I did not track down the resulting user-visible bugs, if any. Back-patch to 9.5, where all but one discrepancy appeared. The _equalCreateEventTrigStmt() situation dates to 9.3 but does not affect semantics. catversion bump due to readfuncs.c field order changes.	2015-08-05 20:44:27 -04:00
Alvaro Herrera	2834855cb9	Fix BRIN to use SnapshotAny during summarization For correctness of summarization results, it is critical that the snapshot used during the summarization scan is able to see all tuples that are live to all transactions -- including tuples inserted or deleted by in-progress transactions. Otherwise, it would be possible for a transaction to insert a tuple, then idle for a long time while a concurrent transaction executes summarization of the range: this would result in the inserted value not being considered in the summary. Previously we were trying to use an MVCC snapshot in conjunction with adding a "placeholder" tuple in the index: the snapshot would see all committed tuples, and the placeholder tuple would catch insertions by any new inserters. The hole is that prior insertions by transactions that are still in progress by the time the MVCC snapshot was taken were ignored. Kevin Grittner reported this as a bogus error message during vacuum with default transaction isolation mode set to repeatable read (because the error report mentioned a function name not being invoked during), but the problem is larger than that. To fix, tweak IndexBuildHeapRangeScan to have a new mode that behaves the way we need using SnapshotAny visibility rules. This change simplifies the BRIN code a bit, mainly by removing large comments that were mistaken. Instead, rely on the SnapshotAny semantics to provide what it needs. (The business about a placeholder tuple needs to remain: that covers the case that a transaction inserts a tuple in a page that summarization already scanned.) Discussion: https://www.postgresql.org/message-id/20150731175700.GX2441@postgresql.org In passing, remove a couple of unused declarations from brin.h and reword a comment to be proper English. This part submitted by Kevin Grittner. Backpatch to 9.5, where BRIN was introduced.	2015-08-05 16:20:50 -03:00
Alvaro Herrera	e8e86fbc8b	Fix volatility marking of commit timestamp functions They are marked stable, but since they act on instantaneous state and it is possible to consult state of transactions as they commit, the results could change mid-query. They need to be marked volatile, and this commit does so. There would normally be a catversion bump here, but this is so much a niche feature and I don't believe there's real damage from the incorrect marking, that I refrained. Backpatch to 9.5, where commit timestamps were introduced. Per note from Fujii Masao.	2015-07-30 15:19:49 -03:00
Joe Conway	f781a0f1d8	Create a pg_shdepend entry for each role in TO clause of policies. CreatePolicy() and AlterPolicy() omit to create a pg_shdepend entry for each role in the TO clause. Fix this by creating a new shared dependency type called SHARED_DEPENDENCY_POLICY and assigning it to each role. Reported by Noah Misch. Patch by me, reviewed by Alvaro Herrera. Back-patch to 9.5 where RLS was introduced.	2015-07-28 16:01:53 -07:00
Joe Conway	1e2bd43b31	Bump catversion so that HEAD is beyond 9.5 As pointed out by Tom, since HEAD has progressed beyond 9.5 in terms of its catalog, we need to be sure catversion of HEAD is advanced beyond that of 9.5. Corrects my mistake in the pg_stats view commit `cfa928ff`.	2015-07-28 13:59:23 -07:00
Joe Conway	7b4bfc87d5	Plug RLS related information leak in pg_stats view. The pg_stats view is supposed to be restricted to only show rows about tables the user can read. However, it sometimes can leak information which could not otherwise be seen when row level security is enabled. Fix that by not showing pg_stats rows to users that would be subject to RLS on the table the row is related to. This is done by creating/using the newly introduced SQL visible function, row_security_active(). Along the way, clean up three call sites of check_enable_rls(). The second argument of that function should only be specified as other than InvalidOid when we are checking as a different user than the current one, as in when querying through a view. These sites were passing GetUserId() instead of InvalidOid, which can cause the function to return incorrect results if the current user has the BYPASSRLS privilege and row_security has been set to OFF. Additionally fix a bug causing RI Trigger error messages to unintentionally leak information when RLS is enabled, and other minor cleanup and improvements. Also add WITH (security_barrier) to the definition of pg_stats. Bumped CATVERSION due to new SQL functions and pg_stats view definition. Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav. Patch by Joe Conway and Dean Rasheed with review and input by Michael Paquier and Stephen Frost.	2015-07-28 13:21:22 -07:00
Tom Lane	dd7a8f66ed	Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.	2015-07-25 14:39:00 -04:00
Alvaro Herrera	149b1dd840	Fix omission of OCLASS_TRANSFORM in object_classes[] This was forgotten in `cac7658205` (and its fixup `ad89a5d115`). Since it seems way too easy to miss this, this commit also introduces a mechanism to enforce that the array is consistent with the enum. Problem reported independently by Robert Haas and Jaimin Pan. Patches proposed by Jaimin Pan, Jim Nasby, Michael Paquier and myself, though I didn't use any of these and instead went with a cleaner approach suggested by Tom Lane. Backpatch to 9.5. Discussion: https://www.postgresql.org/message-id/CA+Tgmoa6SgDaxW_n_7SEhwBAc=mniYga+obUj5fmw4rU9_mLvA@mail.gmail.com https://www.postgresql.org/message-id/29788.1437411581@sss.pgh.pa.us	2015-07-21 13:20:53 +02:00
Robert Haas	a04bb65f70	Add new function pg_notification_queue_usage. This tells you what fraction of NOTIFY's queue is currently filled. Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh. A few further tweaks by me.	2015-07-17 09:12:03 -04:00
Tom Lane	10fb48d66d	Add an optional missing_ok argument to SQL function current_setting(). This allows convenient checking for existence of a GUC from SQL, which is particularly useful when dealing with custom variables. David Christensen, reviewed by Jeevan Chalke	2015-07-02 16:41:07 -04:00
Heikki Linnakangas	7931622d1d	Fix name of argument to pg_stat_file. It's called "missing_ok" in the docs and in the C code. I refrained from doing a catversion bump for this, because the name of an input argument is just documentation, it has no effect on any callers. Michael Paquier	2015-07-02 12:15:13 +03:00
Tom Lane	62d16c7fc5	Improve design and implementation of pg_file_settings view. As first committed, this view reported on the file contents as they were at the last SIGHUP event. That's not as useful as reporting on the current contents, and what's more, it didn't work right on Windows unless the current session had serviced at least one SIGHUP. Therefore, arrange to re-read the files when pg_show_all_settings() is called. This requires only minor refactoring so that we can pass changeVal = false to set_config_option() so that it won't actually apply any changes locally. In addition, add error reporting so that errors that would prevent the configuration files from being loaded, or would prevent individual settings from being applied, are visible directly in the view. This makes the view usable for pre-testing whether edits made in the config files will have the desired effect, before one actually issues a SIGHUP. I also added an "applied" column so that it's easy to identify entries that are superseded by later entries; this was the main use-case for the original design, but it seemed unnecessarily hard to use for that. Also fix a 9.4.1 regression that allowed multiple entries for a PGC_POSTMASTER variable to cause bogus complaints in the postmaster log. (The issue here was that commit `bf007a27ac` unintentionally reverted `3e3f65973a`, which suppressed any duplicate entries within ParseConfigFp. However, since the original coding of the pg_file_settings view depended on such suppression not happening, we couldn't have fixed this issue now without first doing something with pg_file_settings. Now we suppress duplicates by marking them "ignored" within ProcessConfigFileInternal, which doesn't hide them in the view.) Lesser changes include: Drive the view directly off the ConfigVariable list, instead of making a basically-equivalent second copy of the data. There's no longer any need to hang onto the data permanently, anyway. Convert show_all_file_settings() to do its work in one call and return a tuplestore; this avoids risks associated with assuming that the GUC state will hold still over the course of query execution. (I think there were probably latent bugs here, though you might need something like a cursor on the view to expose them.) Arrange to run SIGHUP processing in a short-lived memory context, to forestall process-lifespan memory leaks. (There is one known leak in this code, in ProcessConfigDirectory; it seems minor enough to not be worth back-patching a specific fix for.) Remove mistaken assignment to ConfigFileLineno that caused line counting after an include_dir directive to be completely wrong. Add missed failure check in AlterSystemSetConfigFile(). We don't really expect ParseConfigFp() to fail, but that's not an excuse for not checking.	2015-06-28 18:06:14 -04:00
Heikki Linnakangas	cb2acb1081	Add missing_ok option to the SQL functions for reading files. This makes it possible to use the functions without getting errors, if there is a chance that the file might be removed or renamed concurrently. pg_rewind needs to do just that, although this could be useful for other purposes too. (The changes to pg_rewind to use these functions will come in a separate commit.) The read_binary_file() function isn't very well-suited for extension.c's purposes anymore, if it ever was. So bite the bullet and make a copy of it in extension.c, tailored for that use case. This seems better than the accidental code reuse, even if it's some more lines of code. Michael Paquier, with plenty of kibitzing by me.	2015-06-28 21:35:46 +03:00
Andrew Dunstan	908e234733	Rename jsonb - text[] operator to #- to avoid ambiguity. Following recent discussion on -hackers. The underlying function is also renamed to jsonb_delete_path. The regression tests now don't need ugly type casts to avoid the ambiguity, so they are also removed. Catalog version bumped.	2015-06-11 10:06:58 -04:00
Fujii Masao	ea9c4c1e4a	Fix typo in comment. David Rowley	2015-06-10 15:26:02 +09:00
Andrew Dunstan	37def42245	Rename jsonb_replace to jsonb_set and allow it to add new values The function is given a fourth parameter, which defaults to true. When this parameter is true, if the last element of the path is missing in the original json, jsonb_set creates it in the result and assigns it the new value. If it is false then the function does nothing unless all elements of the path are present, including the last. Based on some original code from Dmitry Dolgov, heavily modified by me. Catalog version bumped.	2015-05-31 20:34:10 -04:00
Tom Lane	1c8c656b3c	Check that all aliases of a built-in function have same leakproof property. opr_sanity.sql has a test checking that relevant properties of built-in functions match when the same C function is referenced by multiple pg_proc entries. The test neglected to check proleakproof, though, and when I added that condition it exposed that xideqint4 hadn't been updated to match xideq. So fix that as well, and in consequence bump catversion. This isn't very critical, so no need to worry about fixing back branches.	2015-05-29 13:26:21 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Andres Freund	631d749007	Remove the new UPSERT command tag and use INSERT instead. Previously, INSERT with ON CONFLICT DO UPDATE specified used a new command tag -- UPSERT. It was introduced out of concern that INSERT as a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as some affected rows may actually have been updated. Alvaro Herrera noticed that the implementation of that new command tag was incomplete; in subsequent discussion we concluded that having it doesn't provide benefits that are in line with the compatibility breaks it requires. Catversion bump due to the removal of PlannedStmt->isUpsert. Author: Peter Geoghegan Discussion: 20150520215816.GI5885@postgresql.org	2015-05-23 00:58:45 +02:00
Heikki Linnakangas	4fc72cc7bb	Collection of typo fixes. Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.	2015-05-20 16:56:22 +03:00
Tom Lane	0b28ea79c0	Avoid collation dependence in indexes of system catalogs. No index in template0 should have collation-dependent ordering, especially not indexes on shared catalogs. For most textual columns we avoid this issue by using type "name" (which sorts per strcmp()). However there are a few indexed columns that we'd prefer to use "text" for, and for that, the default opclass text_ops is unsafe. Fortunately, text_pattern_ops is safe (it sorts per memcmp()), and it has no real functional disadvantage for our purposes. So change the indexes on pg_seclabel.provider and pg_shseclabel.provider to use text_pattern_ops. In passing, also mark pg_replication_origin.roname as using text_pattern_ops --- for some reason it was labeled varchar_pattern_ops which is just wrong, even though it accidentally worked. Add regression test queries to catch future errors of these kinds. We still can't do anything about the misdeclared pg_seclabel and pg_shseclabel indexes in back branches :-(	2015-05-19 11:47:42 -04:00
Tom Lane	afee04352b	Revert "Change pg_seclabel.provider and pg_shseclabel.provider to type "name"." This reverts commit `b82a7be603`. There is a better (less invasive) way to fix it, which I will commit next.	2015-05-19 10:40:04 -04:00
Tom Lane	b82a7be603	Change pg_seclabel.provider and pg_shseclabel.provider to type "name". These were "text", but that's a bad idea because it has collation-dependent ordering. No index in template0 should have collation-dependent ordering, especially not indexes on shared catalogs. There was general agreement that provider names don't need to be longer than other identifiers, so we can fix this at a small waste of table space by changing from text to name. There's no way to fix the problem in the back branches, but we can hope that security labels don't yet have widespread-enough usage to make it urgent to fix. There needs to be a regression sanity test to prevent us from making this same mistake again; but before putting that in, we'll need to get rid of similar brain fade in the recently-added pg_replication_origin catalog. Note: for lack of a suitable testing environment, I've not really exercised this change. I trust the buildfarm will show up any mistakes.	2015-05-18 20:07:53 -04:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg nodes to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Alvaro Herrera	b0b7be6133	Add BRIN infrastructure for "inclusion" opclasses This lets BRIN be used with R-Tree-like indexing strategies. Also provided are operator classes for range types, box and inet/cidr. The infrastructure provided here should be sufficient to create operator classes for similar datatypes; for instance, opclasses for PostGIS geometries should be doable, though we didn't try to implement one. (A box/point opclass was also submitted, but we ripped it out before commit because the handling of floating point comparisons in existing code is inconsistent and would generate corrupt indexes.) Author: Emre Hasegeli. Cosmetic changes by me Review: Andreas Karlsson	2015-05-15 18:05:22 -03:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible Add a TABLESAMPLE clause to SELECT statements that allows the user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows the SQL Standard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
Heikki Linnakangas	35fcb1b3d0	Allow GiST distance function to return merely a lower-bound. The distance function can now set *recheck = true, like index quals. The executor will then re-check the ORDER BY expressions, and use a queue to reorder the results on the fly. This makes it possible to do kNN-searches on polygons and circles, which don't store the exact value in the index, but just a bounding box. Alexander Korotkov and me	2015-05-15 14:26:51 +03:00
Fujii Masao	ecd222e770	Support VERBOSE option in REINDEX command. When this option is specified, a progress report is printed as each index is reindexed. Per discussion, we agreed on the following syntax for the extensibility of the options. REINDEX (flexible options) { INDEX | ... } name Sawada Masahiko. Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi, Jim Nasby and me. Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com	2015-05-15 20:09:57 +09:00
Peter Eisentraut	a486e35706	Add pg_settings.pending_restart column with input from David G. Johnston, Robert Haas, Michael Paquier	2015-05-14 20:08:51 -04:00
Andrew Dunstan	5c7df74204	Fix some errors from jsonb functions patch. The catalog version should have been bumped, and the alternative regression result file was not up to date with the name of jsonb_pretty.	2015-05-12 16:54:38 -04:00
Andrew Dunstan	c6947010ce	Additional functions and operators for jsonb. jsonb_pretty(jsonb) produces nicely indented json output. jsonb || jsonb concatenates two jsonb values. jsonb - text removes a key and its associated value from the json. jsonb - int removes the designated array element. jsonb - text[] removes a key and associated value or array element at the designated path. jsonb_replace(jsonb,text[],jsonb) replaces the array element designated by the path or the value associated with the key designated by the path with the given value. Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.	2015-05-12 15:52:45 -04:00
Alvaro Herrera	b488c580ae	Allow on-the-fly capture of DDL event details This feature lets user code inspect and take action on DDL events. Whenever a ddl_command_end event trigger is installed, DDL actions executed are saved to a list which can be inspected during execution of a function attached to ddl_command_end. The set-returning function pg_event_trigger_ddl_commands can be used to list actions so captured; it returns data about the type of command executed, as well as the affected object. This is sufficient for many uses of this feature. For the cases where it is not, we also provide a "command" column of a new pseudo-type pg_ddl_command, which is a pointer to a C structure that can be accessed by C code. The struct contains all the info necessary to completely inspect and even reconstruct the executed command. There is no actual deparse code here; that's expected to come later. What we have is enough infrastructure that the deparsing can be done in an external extension. The intention is that we will add some deparsing code in a later release, as an in-core extension. A new test module is included. It's probably insufficient as is, but it should be sufficient as a starting point for a more complete and future-proof approach. Authors: Álvaro Herrera, with some help from Andres Freund, Ian Barwick, Abhijit Menon-Sen. Reviews by Andres Freund, Robert Haas, Amit Kapila, Michael Paquier, Craig Ringer, David Steele. Additional input from Chris Browne, Dimitri Fontaine, Stephen Frost, Petr Jelínek, Tom Lane, Jim Nasby, Steven Singer, Pavel Stěhule. Based on original work by Dimitri Fontaine, though I didn't use his code. Discussion: https://www.postgresql.org/message-id/m2txrsdzxa.fsf@2ndQuadrant.fr https://www.postgresql.org/message-id/20131108153322.GU5809@eldon.alvh.no-ip.org https://www.postgresql.org/message-id/20150215044814.GL3391@alvh.no-ip.org	2015-05-11 19:14:31 -03:00
Andrew Dunstan	cb9fa802b3	Add new OID alias type regnamespace Catalog version bumped Kyotaro HORIGUCHI	2015-05-09 13:36:52 -04:00
Andrew Dunstan	0c90f6769d	Add new OID alias type regrole The new type has the scope of the whole database cluster so it doesn't behave the same as the existing OID alias types which have database scope, concerning object dependency. To avoid confusion constants of the new type are prohibited from appearing where dependencies are made involving it. Also, add a note to the docs about possible MVCC violation and optimization issues, which are general over all the reg* types. Kyotaro Horiguchi	2015-05-09 13:06:49 -04:00
Stephen Frost	4b342fb591	Bump catversion for pg_file_settings Pointed out by Andres (thanks!) Apologies for not including it in the initial patch.	2015-05-08 19:14:32 -04:00
Stephen Frost	a97e0c3354	Add pg_file_settings view and function The function and view added here provide a way to look at all settings in postgresql.conf, any #include'd files, and postgresql.auto.conf (which is what backs the ALTER SYSTEM command). The information returned includes the configuration file name, line number in that file, sequence number indicating when the parameter is loaded (useful to see if it is later masked by another definition of the same parameter), parameter name, and what it is set to at that point. This information is updated on reload of the server. This is unfiltered, privileged, information and therefore access is restricted to superusers through the GRANT system. Author: Sawada Masahiko, various improvements by me. Reviewers: David Steele	2015-05-08 19:09:26 -04:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using an inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Andres Freund	2c8f4836db	Represent columns requiring insert and update privileges independently. Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund	2015-05-08 00:20:46 +02:00
Alvaro Herrera	db5f98ab4f	Improve BRIN infra, minmax opclass and regression test The minmax opclass was using the wrong support functions when cross-datatypes queries were run. Instead of trying to fix the pg_amproc definitions (which apparently is not possible), use the already correct pg_amop entries instead. This requires jumping through more hoops (read: extra syscache lookups) to obtain the underlying functions to execute, but it is necessary for correctness. Author: Emre Hasegeli, tweaked by Álvaro Review: Andreas Karlsson Also change BrinOpcInfo to record each stored type's typecache entry instead of just the OID. Turns out that the full type cache is necessary in brin_deform_tuple: the original code used the indexed type's byval and typlen properties to extract the stored tuple, which is correct in Minmax; but in other implementations that want to store something different, that's wrong. The realization that this is a bug comes from Emre also, but I did not use his patch. I also adopted Emre's regression test code (with smallish changes), which is more complete.	2015-05-07 13:02:22 -03:00
Alvaro Herrera	3b6db1f445	Add geometry/range functions to support BRIN inclusion This commit adds the following functions: box(point) -> box bound_box(box, box) -> box inet_same_family(inet, inet) -> bool inet_merge(inet, inet) -> cidr range_merge(anyrange, anyrange) -> anyrange The first of these is also used to implement a new assignment cast from point to box. These functions are the first part of a base to implement an "inclusion" operator class for BRIN, for multidimensional data types. Author: Emre Hasegeli Reviewed by: Andreas Karlsson	2015-05-05 15:22:24 -03:00
Tom Lane	2503982be4	Improve procost estimates for some text search functions. The text search functions that involve parsing raw text into lexemes are remarkably CPU-intensive, so estimating them at the same cost as most other built-in functions seems like a mistake; moreover, doing so turns out to discourage the optimizer from using functional indexes on these functions. After some debate, we've agreed to raise procost from 1 to 100 for to_tsvector(), plainto_tsvector(), to_tsquery(), ts_headline(), ts_match_tt(), and ts_match_tq(), which are all the text search functions that parse raw text. Also increase procost for the 2-argument form of ts_rewrite() (tsquery_rewrite_query); while this function doesn't do text parsing, it does execute a user-supplied SQL query, so its previous procost of 1 is clearly a drastic underestimate. It seems reasonable to assign it the same cost we assign to PL functions by default, so 100 is the number here too. I did not bother bumping catversion for this change, since it does not break catalog compatibility with the server executable nor result in any regression test changes. Per complaint from Andrew Gierth and subsequent discussion.	2015-05-04 15:38:57 -04:00
Andres Freund	2b22795b32	Copy editing of the replication origins patch. Michael Paquier and myself.	2015-05-01 12:22:13 +02:00
Robert Haas	924bcf4f16	Create an infrastructure for parallel computation in PostgreSQL. This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client. Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.	2015-04-30 15:02:14 -04:00
Andres Freund	5aa2350426	Introduce replication progress tracking infrastructure. When implementing a replication solution on top of logical decoding, two related problems exist: * How to safely keep track of replication progress * How to change replication behavior, based on the origin of a row; e.g. to avoid loops in bi-directional replication setups The solution to these problems, as implemented here, consists of three parts: 1) 'replication origins', which identify nodes in a replication setup. 2) 'replication progress tracking', which remembers, for each replication origin, how far replay has progressed in an efficient and crash-safe manner. 3) The ability to filter out changes performed at the behest of a replication origin during logical decoding; this allows complex replication topologies. E.g. by filtering all replayed changes out. Most of this could also be implemented in "userspace", e.g. by inserting additional rows containing origin information, but that ends up being much less efficient and more complicated. We don't want to require various replication solutions to reimplement logic for this independently. The infrastructure is intended to be generic enough to be reusable. This infrastructure also replaces the 'nodeid' infrastructure of commit timestamps. It is intended to provide all the former capabilities, except that there are only 2^16 different origins; but now they integrate with logical decoding. Additionally more functionality is accessible via SQL. Since the commit timestamp infrastructure has also been introduced in 9.5 (commit `73c986add`) changing the API is not a problem. For now the number of origins for which the replication progress can be tracked simultaneously is determined by the max_replication_slots GUC. That GUC is not a perfect match to configure this, but there doesn't seem to be sufficient reason to introduce a separate new one. Bumps both catversion and wal page magic. Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer Discussion: 20150216002155.GI15326@awork2.anarazel.de, 20140923182422.GA15776@alap3.anarazel.de, 20131114172632.GE7522@alap2.anarazel.de	2015-04-29 19:30:53 +02:00
Peter Eisentraut	cac7658205	Add transforms feature This provides a mechanism for specifying conversions between SQL data types and procedural languages. As examples, there are transforms for hstore and ltree for PL/Perl and PL/Python. reviews by Pavel Stěhule and Andres Freund	2015-04-26 10:33:14 -04:00
Andres Freund	cef939c347	Rename pg_replication_slot's new active_in to active_pid. In `d811c037ce` active_in was added but discussion since showed that active_pid is preferred as a name. Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-22 09:43:40 +02:00
Andres Freund	d811c037ce	Add 'active_in' column to pg_replication_slots. Right now it is visible whether a replication slot is active in any session, but not in which. Adding the active_in column, containing the pid of the backend having acquired the slot, makes it much easier to associate pg_replication_slots entries with the corresponding pg_stat_replication/pg_stat_activity row. This should have been done from the start, but I (Andres) dropped the ball there somehow. Author: Craig Ringer, revised by me Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-21 11:51:06 +02:00
Bruce Momjian	f92fc4c95d	pg_upgrade: binary_upgrade_create_empty_extension() is strict Was broken by commit `30982be4e5`. Patch by Jeff Janes	2015-04-17 20:08:42 -04:00
Peter Eisentraut	30982be4e5	Integrate pg_upgrade_support module into backend Previously, these functions were created in a schema "binary_upgrade", which was deleted after pg_upgrade was finished. Because we don't want to keep that schema around permanently, move them to pg_catalog but rename them with a binary_upgrade_... prefix. The provided functions are only small wrappers around global variables that were added specifically for pg_upgrade use, so keeping the module separate does not create any modularity. The functions still check that they are only called in binary upgrade mode, so it is not possible to call these during normal operation. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-14 19:26:37 -04:00
Heikki Linnakangas	4f700bcd20	Reorganize our CRC source files again. Now that we use CRC-32C in WAL and the control file, the "traditional" and "legacy" CRC-32 variants are not used in any frontend programs anymore. Move the code for those back from src/common to src/backend/utils/hash. Also move the slicing-by-8 implementation (back) to src/port. This is in preparation for next patch that will add another implementation that uses Intel SSE 4.2 instructions to calculate CRC-32C, where available.	2015-04-14 17:03:42 +03:00
Magnus Hagander	9029f4b374	Add system view pg_stat_ssl This view shows information about all connections, such as if the connection is using SSL, which cipher is used, and which client certificate (if any) is used. Reviews by Alex Shulgin, Heikki Linnakangas, Andres Freund & Michael Paquier	2015-04-12 19:07:46 +02:00
Alvaro Herrera	e9a077cad3	pg_event_trigger_dropped_objects: add is_temp column It now also reports temporary objects dropped that are local to the backend. Previously we weren't reporting any temp objects because it was deemed unnecessary; but as it turns out, it is necessary if we want to keep close track of DDL command execution inside one session. Temp objects are reported as living in schema pg_temp, which works because such a schema-qualification always refers to the temp objects of the current session.	2015-04-06 11:40:55 -03:00
Robert Haas	abd94bcac4	Use abbreviated keys for faster sorting of numeric datums. Andrew Gierth, reviewed by Peter Geoghegan, with further tweaks by me.	2015-04-02 14:04:26 -04:00
Heikki Linnakangas	f770870d9e	Move inet/cidr GiST opclass functions to correct place in header file. They were accidentally placed under the GIN heading. Andreas Karlsson	2015-04-01 19:20:45 +03:00
Alvaro Herrera	97690ea6e8	Change array_offset to return subscripts, not offsets ... and rename it and its sibling array_offsets to array_position and array_positions, to account for the changed behavior. Having the functions return subscripts better matches existing practice, and is better suited to using the result value as a subscript into the array directly. For one-based arrays, the new definition is identical to what was originally committed. (We use the term "subscript" in the documentation, which is what we use whenever we talk about arrays; but the functions themselves are named using the word "position" to match the standard-defined POSITION() functions.) Author: Pavel Stěhule Behavioral problem noted by Dean Rasheed.	2015-03-30 16:13:21 -03:00
Heikki Linnakangas	0633a60f4d	Add index-only scan support to range type GiST opclass. Andreas Karlsson	2015-03-30 13:22:38 +03:00
Heikki Linnakangas	3a20b0e7b6	Add index-only scan support to inet GiST opclass. Andreas Karlsson	2015-03-28 15:11:53 +02:00
Heikki Linnakangas	d04c8ed904	Add support for index-only scans in GiST. This adds a new GiST opclass method, 'fetch', which is used to reconstruct the original Datum from the value stored in the index. Also, the 'canreturn' index AM interface function gains a new 'attno' argument. That makes it possible to use index-only scans on a multi-column index where some of the opclasses support index-only scans but some do not. This patch adds support in the box and point opclasses. Other opclasses can be added later as follow-on patches (btree_gist would be particularly interesting). Anastasia Lubennikova, with additional fixes and modifications by me.	2015-03-26 19:12:00 +02:00
Alvaro Herrera	bdc3d7fa23	Return ObjectAddress in many ALTER TABLE sub-routines Since commit `a2e35b53c3`, most CREATE and ALTER commands return the ObjectAddress of the affected object. This is useful for event triggers to try to figure out exactly what happened. This patch extends this idea a bit further to cover ALTER TABLE as well: an auxiliary ObjectAddress is returned for each of several subcommands of ALTER TABLE. This makes it possible to decode with precision what happened during execution of any ALTER TABLE command; for instance, which constraint was added by ALTER TABLE ADD CONSTRAINT, or which parent got dropped from the parents list by ALTER TABLE NO INHERIT. As with the previous patch, there is no immediate user-visible change here. This is all really just continuing what `c504513f83` started. Reviewed by Stephen Frost.	2015-03-25 17:17:56 -03:00
Bruce Momjian	1c7087af42	Add TOAST table to pg_shseclabel for long label use Report by Andres Freund	2015-03-21 22:14:49 -04:00
Andres Freund	959277a4f5	Use 128-bit math to accelerate some aggregation functions. On platforms where we support 128bit integers, use them to implement faster transition functions for sum(int8), avg(int8), var_(int2/int4),stdev_(int2/int4). Where not supported continue to use numeric as a transition type. In some synthetic benchmarks this has been shown to provide significant speedups. Bumps catversion. Discussion: 544BB5F1.50709@proxel.se Author: Andreas Karlsson Reviewed-By: Peter Geoghegan, Petr Jelinek, Andres Freund, Oskari Saarenmaa, David Rowley	2015-03-20 10:29:32 +01:00
Alvaro Herrera	13dbc7a824	array_offset() and array_offsets() These functions return the offset position or positions of a value in an array. Author: Pavel Stěhule Reviewed by: Jim Nasby	2015-03-18 16:01:34 -03:00
Tom Lane	7b8b8a4331	Improve representation of PlanRowMark. This patch fixes two inadequacies of the PlanRowMark representation. First, that the original LockingClauseStrength isn't stored (and cannot be inferred for foreign tables, which always get ROW_MARK_COPY). Since some PlanRowMarks are created out of whole cloth and don't actually have an ancestral RowMarkClause, this requires adding a dummy LCS_NONE value to enum LockingClauseStrength, which is fairly annoying but the alternatives seem worse. This fix allows getting rid of the use of get_parse_rowmark() in FDWs (as per the discussion around commits `462bd95705` and `8ec8760fc8`), and it simplifies some things elsewhere. Second, that the representation assumed that all child tables in an inheritance hierarchy would use the same RowMarkType. That's true today but will soon not be true. We add an "allMarkTypes" field that identifies the union of mark types used in all a parent table's children, and use that where appropriate (currently, only in preprocess_targetlist()). In passing fix a couple of minor infelicities left over from the SKIP LOCKED patch, notably that _outPlanRowMark still thought waitPolicy is a bool. Catversion bump is required because the numeric values of enum LockingClauseStrength can appear in on-disk rules. Extracted from a much larger patch to support foreign table inheritance; it seemed worth breaking this out, since it's a separable concern. Shigeru Hanada and Etsuro Fujita, somewhat modified by me	2015-03-15 18:41:47 -04:00
Peter Eisentraut	bb8582abf3	Remove rolcatupdate This role attribute is an ancient PostgreSQL feature, but could only be set by directly updating the system catalogs, and it doesn't have any clearly defined use. Author: Adam Brightwell <adam.brightwell@crunchydatasolutions.com>	2015-03-06 23:42:38 -05:00
Alvaro Herrera	a2e35b53c3	Change many routines to return ObjectAddress rather than OID The changed routines are mostly those that can be directly called by ProcessUtilitySlow; the intention is to make the affected object information more precise, in support for future event trigger changes. Originally it was envisioned that the OID of the affected object would be enough, and in most cases that is correct, but upon actually implementing the event trigger changes it turned out that ObjectAddress is more widely useful. Additionally, some command execution routines grew an output argument that's an object address which provides further info about the executed command. To wit: * for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of the new constraint * for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the schema that originally contained the object. * for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address of the object added to or dropped from the extension. There's no user-visible change in this commit, and no functional change either. Discussion: 20150218213255.GC6717@tamriel.snowman.net Reviewed-By: Stephen Frost, Andres Freund	2015-03-03 14:10:50 -03:00
Tom Lane	b67f1ce181	Reduce json <=> jsonb casts from explicit-only to assignment level. There's no reason to make users write an explicit cast to store a json value in a jsonb column or vice versa. We could probably even make these implicit, but that might open us up to problems with ambiguous function calls, so for now just do this.	2015-03-03 11:26:04 -05:00
Noah Misch	b8a18ad485	Add transform functions for AT TIME ZONE. This makes "ALTER TABLE tabname ALTER tscol TYPE ... USING tscol AT TIME ZONE 'UTC'" skip rewriting the table when altering from "timestamp" to "timestamptz" or vice versa. While it would be nicer still to optimize this in the absence of the USING clause given timezone==UTC, transform functions must consult IMMUTABLE facts only.	2015-03-01 13:22:34 -05:00
Tom Lane	c063da1769	Add parse location fields to NullTest and BooleanTest structs. We did not need a location tag on NullTest or BooleanTest before, because no error messages referred directly to their locations. That's planned to change though, so add these fields in a separate housekeeping commit. Catversion bump because stored rules may change.	2015-02-22 14:40:27 -05:00
Andres Freund	82a532b34d	Force some system catalog table columns to be marked NOT NULL. In a manual pass over the catalog declaration I found a number of columns which the bootstrap automatism didn't mark NOT NULL even though they actually were. Add BKI_FORCE_NOT_NULL markings to them. It's usually not critical if a system table column is falsely determined to be nullable as the code should always catch relevant cases. But it's good to have an extra layer in place. Discussion: 20150215170014.GE15326@awork2.anarazel.de	2015-02-21 22:37:05 +01:00
Andres Freund	eb68379c38	Allow forcing nullness of columns during bootstrap. Bootstrap determines whether a column is null based on simple builtin rules. Those work surprisingly well, but nonetheless a few existing columns aren't set correctly. Additionally there is at least one patch sent to hackers where forcing the nullness of a column would be helpful. The bootstrap format has gained FORCE [NOT] NULL for this, which will be emitted by genbki.pl when BKI_FORCE_(NOT_)?NULL is specified for a column in a catalog header. This patch doesn't change the marking of any existing columns. Discussion: 20150215170014.GE15326@awork2.anarazel.de	2015-02-21 22:31:54 +01:00
Tom Lane	692bd09ad1	Use "#ifdef CATALOG_VARLEN" to protect nullable fields of pg_authid. This gives a stronger guarantee than a mere comment against accessing these fields as simple struct members. Since rolpassword is in fact varlena, it's not clear why these didn't get marked from the beginning, but let's do it now. Michael Paquier	2015-02-20 00:23:48 -05:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Tom Lane	2fb7a75f37	Add pg_stat_get_snapshot_timestamp() to show statistics snapshot timestamp. Per discussion, this could be useful for purposes such as programmatically detecting a nonresponding stats collector. We already have the timestamp anyway, it's just a matter of providing a SQL-accessible function to fetch it. Matt Kelly, reviewed by Jim Nasby	2015-02-19 21:36:50 -05:00
Tom Lane	56a79a869b	Split array_push into separate array_append and array_prepend functions. There wasn't any good reason for a single C function to implement both these SQL functions: it saved very little code overall, and it required significant pushups to re-determine at runtime which case applied. Redoing it as two functions ends up with just slightly more lines of code, but it's simpler to understand, and faster too because we need not repeat syscache lookups on every call. An important side benefit is that this eliminates the only case in which different aliases of the same C function had both anyarray and anyelement arguments at the same position, which would almost always be a mistake. The opr_sanity regression test will now notice such mistakes since there's no longer a valid case where it happens.	2015-02-18 20:53:33 -05:00
Heikki Linnakangas	c619c2351f	Move pg_crc.c to src/common, and remove pg_crc_tables.h To get CRC functionality in a client program, you now need to link with libpgcommon instead of libpgport. The CRC code has nothing to do with portability, so libpgcommon is a better home. (libpgcommon didn't exist when pg_crc.c was originally moved to src/port.) Remove the possibility to get CRC functionality by just #including pg_crc_tables.h. I'm not aware of any extensions that actually did that and couldn't simply link with libpgcommon. This also moves the pg_crc.h header file from src/include/utils to src/include/common, which will require changes to any external programs that currently do #include "utils/pg_crc.h". That seems acceptable, as include/common is clearly the right home for it now, and the change needed to any such programs is trivial.	2015-02-09 11:17:56 +02:00
Tom Lane	3d660d33aa	Fix assorted oversights in range selectivity estimation. calc_rangesel() failed outright when comparing range variables to empty constant ranges with < or >=, as a result of missing cases in a switch. It also produced a bogus estimate for > comparison to an empty range. On top of that, the >= and > cases were mislabeled throughout. For nonempty constant ranges, they managed to produce the right answers anyway as a result of counterbalancing typos. Also, default_range_selectivity() omitted cases for elem <@ range, range &< range, and range &> range, so that rather dubious defaults were applied for these operators. In passing, rearrange the code in rangesel() so that the elem <@ range case is handled in a less opaque fashion. Report and patch by Emre Hasegeli, some additional work by me	2015-01-30 12:30:59 -05:00
Stephen Frost	c7cf9a2433	Add usebypassrls to pg_user and pg_shadow The row level security patches didn't add the 'usebypassrls' columns to the pg_user and pg_shadow views on the belief that they were deprecated, but we haven't actually said they are and therefore we should include it. This patch corrects that, adds missing documentation for rolbypassrls into the system catalog page for pg_authid, along with the entries for pg_user and pg_shadow, and cleans up a few other uses of 'row-level' cases to be 'row level' in the docs. Pointed out by Amit Kapila. Catalog version bump due to system view changes.	2015-01-28 21:47:15 -05:00
Tom Lane	fd496129d1	Clean up some mess in row-security patches. Fix unsafe coding around PG_TRY in RelationBuildRowSecurity: can't change a variable inside PG_TRY and then use it in PG_CATCH without marking it "volatile". In this case though it seems saner to avoid that by doing a single assignment before entering the TRY block. I started out just intending to fix that, but the more I looked at the row-security code the more distressed I got. This patch also fixes incorrect construction of the RowSecurityPolicy cache entries (there was not sufficient care taken to copy pass-by-ref data into the cache memory context) and a whole bunch of sloppiness around the definition and use of pg_policy.polcmd. You can't use nulls in that column because initdb will mark it NOT NULL --- and I see no particular reason why a null entry would be a good idea anyway, so changing initdb's behavior is not the right answer. The internal value of '\0' wouldn't be suitable in a "char" column either, so after a bit of thought I settled on using '*' to represent ALL. Chasing those changes down also revealed that somebody wasn't paying attention to what the underlying values of ACL_UPDATE_CHR etc really were, and there was a great deal of lackadaisicalness in the catalogs.sgml documentation for pg_policy and pg_policies too. This doesn't pretend to be a complete code review for the row-security stuff, it just fixes the things that were in my face while dealing with the bugs in RelationBuildRowSecurity.	2015-01-24 16:16:22 -05:00
Alvaro Herrera	972bf7d6f1	Tweak BRIN minmax operator class In the union support proc, we were not checking the hasnulls flag of value A early enough, so it could be skipped if the "allnulls" flag in value B is set. Also, a check on the allnulls flag of value "B" was redundant, so remove it. Also change inet_minmax_ops to not be the default opclass for type inet, as a future inclusion operator class would be more useful and it's pretty difficult to change default opclass for a datatype later on. (There is no catversion bump for this catalog change; this shouldn't be a problem.) Extracted from a larger patch to add an "inclusion" operator class. Author: Emre Hasegeli	2015-01-22 17:01:09 -03:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Alvaro Herrera	72dd233d3e	pg_event_trigger_dropped_objects: Add name/args output columns These columns can be passed to pg_get_object_address() and used to reconstruct the dropped objects' identities in a remote server containing similar objects, so that the drop can be replicated. Reviewed by Stephen Frost, Heikki Linnakangas, Abhijit Menon-Sen, Andres Freund.	2014-12-30 17:41:46 -03:00
Alvaro Herrera	a676201490	Add pg_identify_object_as_address This function returns object type and objname/objargs arrays, which can be passed to pg_get_object_address. This is especially useful because the textual representation can be copied to a remote server in order to obtain the corresponding OID-based address. In essence, this function is the inverse of recently added pg_get_object_address(). Catalog version bumped due to the addition of the new function. Also add docs to pg_get_object_address.	2014-12-30 15:41:50 -03:00
Alvaro Herrera	a609d96778	Revert "Use a bitmask to represent role attributes" This reverts commit `1826987a46`. The overall design was deemed unacceptable, in discussion following the previous commit message; we might find some parts of it still salvageable, but I don't want to be on the hook for fixing it, so let's wait until we have a new patch.	2014-12-23 15:35:49 -03:00
Alvaro Herrera	d7ee82e50f	Add SQL-callable pg_get_object_address This allows access to get_object_address from SQL, which is useful to obtain OID addressing information from data equivalent to that emitted by the parser. This is necessary infrastructure of a project to let replication systems propagate object dropping events to remote servers, where the schema might be different than the server originating the DROP. This patch also adds support for OBJECT_DEFAULT to get_object_address; that is, it is now possible to refer to a column's default value. Catalog version bumped due to the new function. Reviewed by Stephen Frost, Heikki Linnakangas, Robert Haas, Andres Freund, Abhijit Menon-Sen, Adam Brightwell.	2014-12-23 15:31:29 -03:00
Alvaro Herrera	1826987a46	Use a bitmask to represent role attributes The previous representation using a boolean column for each attribute would not scale as well as we want to add further attributes. Extra auxiliary functions are added to go along with this change, to make up for the lost convenience of access of the old representation. Catalog version bumped due to change in catalogs and the new functions. Author: Adam Brightwell, minor tweaks by Álvaro Reviewed by: Stephen Frost, Andres Freund, Álvaro Herrera	2014-12-23 10:22:09 -03:00
Alvaro Herrera	0ee98d1cbf	pg_event_trigger_dropped_objects: add behavior flags Add "normal" and "original" flags as output columns to the pg_event_trigger_dropped_objects() function. With this it's possible to distinguish which objects, among those listed, need to be explicitly referenced when trying to replicate a deletion. This is necessary so that the list of objects can be pruned to the minimum necessary to replicate the DROP command in a remote server that might have slightly different schema (for instance, TOAST tables and constraints with different names and such.) Catalog version bumped due to change of function definition. Reviewed by: Abhijit Menon-Sen, Stephen Frost, Heikki Linnakangas, Robert Haas.	2014-12-19 15:00:45 -03:00
Heikki Linnakangas	4520ba6769	Add point <-> polygon distance operator. Alexander Korotkov, reviewed by Emre Hasegeli.	2014-12-15 17:06:21 +02:00
Andrew Dunstan	7e354ab9fe	Add several generator functions for jsonb that exist for json. The functions are: to_jsonb() jsonb_object() jsonb_build_object() jsonb_build_array() jsonb_agg() jsonb_object_agg() Also along the way some better logic is implemented in json_categorize_type() to match that in the newly implemented jsonb_categorize_type(). Andrew Dunstan, reviewed by Pavel Stehule and Alvaro Herrera.	2014-12-12 15:31:14 -05:00
Andrew Dunstan	237a882443	Add json_strip_nulls and jsonb_strip_nulls functions. The functions remove object fields, including in nested objects, that have null as a value. In certain cases this can lead to considerably smaller datums, with no loss of semantic information. Andrew Dunstan, reviewed by Pavel Stehule.	2014-12-12 09:00:43 -05:00
Simon Riggs	618c9430a8	Event Trigger for table_rewrite Generate a table_rewrite event when ALTER TABLE attempts to rewrite a table. Provide helper functions to identify table and reason. Intended use case is to help assess or to react to schema changes that might hold exclusive locks for long periods. Dimitri Fontaine, triggering an edit by Simon Riggs Reviewed in detail by Michael Paquier	2014-12-08 00:55:28 +09:00
Alvaro Herrera	73c986adde	Keep track of transaction commit timestamps Transactions can now set their commit timestamp directly as they commit, or an external transaction commit timestamp can be fed from an outside system using the new function TransactionTreeSetCommitTsData(). This data is crash-safe, and truncated at Xid freeze point, same as pg_clog. This module is disabled by default because it causes a performance hit, but can be enabled in postgresql.conf requiring only a server restart. A new test in src/test/modules is included. Catalog version bumped due to the new subdirectory within PGDATA and a couple of new SQL functions. Authors: Álvaro Herrera and Petr Jelínek Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut	2014-12-03 11:53:02 -03:00
Tom Lane	1511521a36	Minor cleanup of function declarations for BRIN. Get rid of PG_FUNCTION_INFO_V1() macros, which are quite inappropriate for built-in functions (possibly leftovers from testing as a loadable module?). Also, fix gratuitous inconsistency between SQL-level and C-level names of the minmax support functions.	2014-12-02 14:07:54 -05:00
Tom Lane	866737c923	Add a #define for the inet overlaps operator. Extracted from pending inet selectivity patch. The rest of it isn't quite ready to commit, but we might as well push this part so the patch doesn't have to track the moving target of pg_operator.h.	2014-11-30 19:43:43 -05:00
Alvaro Herrera	816e10d800	Fix BRIN operator family definitions The original definitions were leaving no room for cross-type operators, so queries that compared a column of one type against something of a different type were not taking advantage of the index. Fix by making the opfamilies more like the ones for Btree, and include a few cross-type operator classes. Catalog version bumped. Per complaints from Hubert Lubaczewski, Mark Wong, Heikki Linnakangas.	2014-11-28 18:09:19 -03:00
Stephen Frost	143b39c185	Rename pg_rowsecurity -> pg_policy and other fixes As pointed out by Robert, we should really have named pg_rowsecurity pg_policy, as the objects stored in that catalog are policies. This patch fixes that and updates the column names to start with 'pol' to match the new catalog name. The security consideration for COPY with row level security, also pointed out by Robert, has also been addressed by remembering and re-checking the OID of the relation initially referenced during COPY processing, to make sure it hasn't changed under us by the time we finish planning out the query which has been built. Robert and Alvaro also commented on missing OCLASS and OBJECT entries for POLICY (formerly ROWSECURITY or POLICY, depending) in various places. This patch fixes that too, which also happens to add the ability to COMMENT on policies. In passing, attempt to improve the consistency of messages, comments, and documentation as well. This removes various incarnations of 'row-security', 'row-level security', 'Row-security', etc, in favor of 'policy', 'row level security' or 'row_security' as appropriate. Happy Thanksgiving!	2014-11-27 01:15:57 -05:00
Tom Lane	bac27394a1	Support arrays as input to array_agg() and ARRAY(SELECT ...). These cases formerly failed with errors about "could not find array type for data type". Now they yield arrays of the same element type and one higher dimension. The implementation involves creating functions with API similar to the existing accumArrayResult() family. I (tgl) also extended the base family by adding an initArrayResult() function, which allows callers to avoid special-casing the zero-inputs case if they just want an empty array as result. (Not all do, so the previous calling convention remains valid.) This allowed simplifying some existing code in xml.c and plperl.c. Ali Akbar, reviewed by Pavel Stehule, significantly modified by me	2014-11-25 12:21:28 -05:00
Heikki Linnakangas	0bd624d63b	Distinguish XLOG_FPI records generated for hint-bit updates. Add a new XLOG_FPI_FOR_HINT record type, and use that for full-page images generated for hint bit updates, when checksums are enabled. The new record type is replayed exactly the same as XLOG_FPI, but allows them to be tallied separately e.g. in pg_xlogdump.	2014-11-24 11:09:08 +02:00
Tom Lane	c5111ea9ca	Remove no-longer-needed phony typedefs in genbki.h. Now that we have a policy of hiding varlena catalog fields behind "#ifdef CATALOG_VARLEN", there is no need for their type names to be acceptable to the C compiler. And experimentation shows that it does not matter to pgindent either. (If it did, we'd have problems anyway, since these typedefs are unreferenced so far as the C compiler is concerned, and find_typedef fails to identify such typedefs.) Hence, remove the phony typedefs that genbki.h provided to make some varlena field definitions compilable. In passing, rearrange #define's into what seemed a more logical order.	2014-11-20 13:16:14 -05:00
Heikki Linnakangas	2c03216d83	Revamp the WAL record format. Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now dissects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-dissected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compensates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.	2014-11-20 18:46:41 +02:00
Alvaro Herrera	85b506bbfc	Get rid of SET LOGGED indexes persistence kludge This removes ATChangeIndexesPersistence() introduced by `f41872d0c1` which was too ugly to live for long. Instead, the correct persistence marking is passed all the way down to reindex_index, so that the transient relation built to contain the index relfilenode can get marked correctly right from the start. Author: Fabrízio de Royes Mello Review and editorialization by Michael Paquier and Álvaro Herrera	2014-11-15 01:19:49 -03:00
Fujii Masao	1871c89202	Add generate_series(numeric, numeric). Платон Малюгин Reviewed by Michael Paquier, Ali Akbar and Marti Raudsepp	2014-11-11 21:44:46 +09:00
Alvaro Herrera	7516f52594	BRIN: Block Range Indexes BRIN is a new index access method intended to accelerate scans of very large tables, without the maintenance overhead of btrees or other traditional indexes. They work by maintaining "summary" data about block ranges. Bitmap index scans work by reading each summary tuple and comparing them with the query quals; all pages in the range are returned in a lossy TID bitmap if the quals are consistent with the values in the summary tuple, otherwise not. Normal index scans are not supported because these indexes do not store TIDs. As new tuples are added into the index, the summary information is updated (if the block range in which the tuple is added is already summarized) or not; in the latter case, a subsequent pass of VACUUM or the brin_summarize_new_values() function will create the summary information. For data types with natural 1-D sort orders, the summary info consists of the maximum and the minimum values of each indexed column within each page range. This type of operator class we call "Minmax", and we supply a bunch of them for most data types with B-tree opclasses. Since the BRIN code is generalized, other approaches are possible for things such as arrays, geometric types, ranges, etc; even for things such as enum types we could do something different than minmax with better results. In this commit I only include minmax. Catalog version bumped due to new builtin catalog entries. There's more that could be done here, but this is a good step forwards. Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera, with contribution by Heikki Linnakangas. Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas. Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo. PS: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 318633.	2014-11-07 16:38:14 -03:00
Heikki Linnakangas	2076db2aea	Move the backup-block logic from XLogInsert to a new file, xloginsert.c. xlog.c is huge, this makes it a little bit smaller, which is nice. Functions related to putting together the WAL record are in xloginsert.c, and the lower level stuff for managing WAL buffers and such are in xlog.c. Also move the definition of XLogRecord to a separate header file. This causes churn in the #includes of all the files that write WAL records, and redo routines, but it avoids pulling in xlog.h into most places. Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.	2014-11-06 13:55:36 +02:00
Fujii Masao	08309aaf74	Implement IF NOT EXISTS for CREATE INDEX. Fabrízio de Royes Mello, reviewed by Marti Raudsepp, Adam Brightwell and me.	2014-11-06 18:48:33 +09:00
Heikki Linnakangas	5028f22f6e	Switch to CRC-32C in WAL and other places. The old algorithm was found to not be the usual CRC-32 algorithm, used by Ethernet et al. We were using a non-reflected lookup table with code meant for a reflected lookup table. That's a strange combination that AFAICS does not correspond to any bit-wise CRC calculation, which makes it difficult to reason about its properties. Although it has worked well in practice, seems safer to use a well-known algorithm. Since we're changing the algorithm anyway, we might as well choose a different polynomial. The Castagnoli polynomial has better error-correcting properties than the traditional CRC-32 polynomial, even if we had implemented it correctly. Another reason for picking that is that some new CPUs have hardware support for calculating CRC-32C, but not CRC-32, let alone our strange variant of it. This patch doesn't add any support for such hardware, but a future patch could now do that. The old algorithm is kept around for tsquery and pg_trgm, which use the values in indexes that need to remain compatible so that pg_upgrade works. While we're at it, share the old lookup table for CRC-32 calculation between hstore, ltree and core. They all use the same table, so might as well.	2014-11-04 11:39:48 +02:00
Andrew Dunstan	af2b8fd057	Correct volatility markings of a few json functions. json_agg and json_object_agg and their associated transition functions should have been marked as stable rather than immutable, as they call IO functions indirectly. Changing this probably isn't going to make much difference, as you can't use an aggregate function in an index expression, but we should be correct nevertheless. json_object, on the other hand, should be marked immutable rather than stable, as it does not call IO functions. As discussed on -hackers, this change is being made without bumping the catalog version, as we don't want to do that at this stage of the cycle, and the changes are very unlikely to affect anyone.	2014-10-20 15:31:05 -04:00
Alvaro Herrera	df630b0dd5	Implement SKIP LOCKED for row-level locks This clause changes the behavior of SELECT locking clauses in the presence of locked rows: instead of causing a process to block waiting for the locks held by other processes (or raise an error, with NOWAIT), SKIP LOCKED makes the new reader skip over such rows. While this is not appropriate behavior for general purposes, there are some cases in which it is useful, such as queue-like tables. Catalog version bumped because this patch changes the representation of stored rules. Reviewed by Craig Ringer (based on a previous attempt at an implementation by Simon Riggs, who also provided input on the syntax used in the current patch), David Rowley, and Álvaro Herrera. Author: Thomas Munro	2014-10-07 17:23:34 -03:00
Stephen Frost	c8a026e4f1	Revert `95d737ff` to add 'ignore_nulls' Per discussion, revert the commit which added 'ignore_nulls' to row_to_json. This capability would be better added as an independent function rather than being bolted on to row_to_json. Additionally, the implementation didn't address complex JSON objects, and so was incomplete anyway. Pointed out by Tom and discussed with Andrew and Robert.	2014-09-29 13:32:22 -04:00
Tom Lane	def4c28cf9	Change JSONB's on-disk format for improved performance. The original design used an array of offsets into the variable-length portion of a JSONB container. However, such an array is basically uncompressible by simple compression techniques such as TOAST's LZ compressor. That's bad enough, but because the offset array is at the front, it tended to trigger the give-up-after-1KB heuristic in the TOAST code, so that the entire JSONB object was stored uncompressed; which was the root cause of bug #11109 from Larry White. To fix without losing the ability to extract a random array element in O(1) time, change this scheme so that most of the JEntry array elements hold lengths rather than offsets. With data that's compressible at all, there tend to be fewer distinct element lengths, so that there is scope for compression of the JEntry array. Every N'th entry is still an offset. To determine the length or offset of any specific element, we might have to examine up to N preceding JEntrys, but that's still O(1) so far as the total container size is concerned. Testing shows that this cost is negligible compared to other costs of accessing a JSONB field, and that the method does largely fix the incompressible-data problem. While at it, rearrange the order of elements in a JSONB object so that it's "all the keys, then all the values" not alternating keys and values. This doesn't really make much difference right at the moment, but it will allow providing a fast path for extracting individual object fields from large JSONB values stored EXTERNAL (ie, uncompressed), analogously to the existing optimization for substring extraction from large EXTERNAL text values. Bump catversion to denote the incompatibility in on-disk format. We will need to fix pg_upgrade to disallow upgrading jsonb data stored with 9.4 betas 1 and 2. Heikki Linnakangas and Tom Lane	2014-09-29 12:29:21 -04:00
Stephen Frost	6550b901fe	Code review for row security. Buildfarm member tick identified an issue where the policies in the relcache for a relation were being replaced underneath a running query, leading to segfaults while processing the policies to be added to a query. Similar to how TupleDesc RuleLocks are handled, add in an equalRSDesc() function to check if the policies have actually changed and, if not, swap back the rsdesc field (using the original instead of the temporarily built one; the whole structure is swapped and then specific fields swapped back). This now passes a CLOBBER_CACHE_ALWAYS for me and should resolve the buildfarm error. In addition to addressing this, add a new chapter in Data Definition under Privileges which explains row security and provides examples of its usage, change \d to always list policies (even if row security is disabled- but note that it is disabled, or enabled with no policies), rework check_role_for_policy (it really didn't need the entire policy, but it did need to be using has_privs_of_role()), and change the field in pg_class to relrowsecurity from relhasrowsecurity, based on Heikki's suggestion. Also from Heikki, only issue SET ROW_SECURITY in pg_restore when talking to a 9.5+ server, list Bypass RLS in \du, and document --enable-row-security options for pg_dump and pg_restore. Lastly, fix a number of minor whitespace and typo issues from Heikki, Dimitri, add a missing #include, per Peter E, fix a few minor variable-assigned-but-not-used and resource leak issues from Coverity and add tab completion for role attribute bypassrls as well.	2014-09-24 16:32:22 -04:00
Andrew Dunstan	b1a52872ae	Fix typos in descriptions of json_object functions.	2014-09-24 11:24:42 -04:00
Stephen Frost	491c029dbc	Row-Level Security Policies (RLS) Building on the updatable security-barrier views work, add the ability to define policies on tables to limit the set of rows which are returned from a query and which are allowed to be added to a table. Expressions defined by the policy for filtering are added to the security barrier quals of the query, while expressions defined to check records being added to a table are added to the with-check options of the query. New top-level commands are CREATE/ALTER/DROP POLICY and are controlled by the table owner. Row Security is able to be enabled and disabled by the owner on a per-table basis using ALTER TABLE .. ENABLE/DISABLE ROW SECURITY. Per discussion, ROW SECURITY is disabled on tables by default and must be enabled for policies on the table to be used. If no policies exist on a table with ROW SECURITY enabled, a default-deny policy is used and no records will be visible. By default, row security is applied at all times except for the table owner and the superuser. A new GUC, row_security, is added which can be set to ON, OFF, or FORCE. When set to FORCE, row security will be applied even for the table owner and superusers. When set to OFF, row security will be disabled when allowed and an error will be thrown if the user does not have rights to bypass row security. Per discussion, pg_dump sets row_security = OFF by default to ensure that exports and backups will have all data in the table or will error if there are insufficient privileges to bypass row security. A new option has been added to pg_dump, --enable-row-security, to ask pg_dump to export with row security enabled. A new role capability, BYPASSRLS, which can only be set by the superuser, is added to allow other users to be able to bypass row security using row_security = OFF. Many thanks to the various individuals who have helped with the design, particularly Robert Haas for his feedback. Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean Rasheed, with additional changes and rework by me. Reviewers have included all of the above, Greg Smith, Jeff McCormick, and Robert Haas.	2014-09-19 11:18:35 -04:00
Andres Freund	728f152e07	Add rmgr callback to name xlog record types for display purposes. This is primarily useful for the upcoming pg_xlogdump --stats feature, but also allows removing some duplicated code in the rmgr_desc routines. Due to the separation and harmonization, the output of displayed records changes somewhat. But since this isn't enduser oriented content that's ok. It's potentially desirable to further change pg_xlogdump's display of records. It previously wasn't possible to show the record type separately from the description forcing it to be in the last column. But that's better done in a separate commit. Author: Abhijit Menon-Sen, slightly editorialized by me Reviewed-By: Álvaro Herrera, Andres Freund, and Heikki Linnakangas Discussion: 20140604104716.GA3989@toroid.org	2014-09-19 16:20:29 +02:00
Heikki Linnakangas	77e65bf369	Fix the return type of GIN triConsistent support functions to "char". They were marked to return a boolean, but they actually return a GinTernaryValue, which is more like a "char". It makes no practical difference, as the triConsistent functions cannot be called directly from SQL because they have "internal" arguments, but this nevertheless seems more correct. Also fix the GinTernaryValue name in the documentation. I renamed the enum earlier, but neglected the docs. Alexander Korotkov. This is new in 9.4, so backpatch there.	2014-09-16 09:22:33 +03:00
Stephen Frost	95d737ff45	Add 'ignore_nulls' option to row_to_json Provide an option to skip NULL values in a row when generating a JSON object from that row with row_to_json. This can reduce the size of the JSON object in cases where columns are NULL without really reducing the information in the JSON object. This also makes row_to_json into a single function with default values, rather than having multiple functions. In passing, change array_to_json to also be a single function with default values (we don't add an 'ignore_nulls' option yet- it's not clear that there is a sensible use-case there, and it hasn't been asked for in any case). Pavel Stehule	2014-09-11 21:23:51 -04:00
Bruce Momjian	36ad1a87a3	Implement mxid_age() to compute multi-xid age Report by Josh Berkus	2014-09-10 17:13:04 -04:00
Tom Lane	e80252d424	Add width_bucket(anyelement, anyarray). This provides a convenient method of classifying input values into buckets that are not necessarily equal-width. It works on any sortable data type. The choice of function name is a bit debatable, perhaps, but showing that there's a relationship to the SQL standard's width_bucket() function seems more attractive than the other proposals. Petr Jelinek, reviewed by Pavel Stehule	2014-09-09 15:34:14 -04:00
Andres Freund	5a64cb740d	Fix s/pluggins/plugins/ typo in two comments. Michael Paquier	2014-09-01 12:01:29 +02:00
Bruce Momjian	d5d7d07765	Again update C comments for pg_attribute.attislocal	2014-08-30 10:25:11 -04:00
Bruce Momjian	c6eaa880ee	Update C comment for pg_attribute.attislocal Indicates if column has ever been local/non-inherited	2014-08-29 19:01:04 -04:00
Tom Lane	6c40f8316e	Add min and max aggregates for inet/cidr data types. Haribabu Kommi, reviewed by Muhammad Asif Naeem	2014-08-28 22:37:58 -04:00
Bruce Momjian	73fe87503f	rename macro isTempOrToastNamespace to isTempOrTempToastNamespace Done for clarity	2014-08-25 21:28:19 -04:00
Tom Lane	e3f9c16838	Fix bogus commutator/negator links for JSONB containment operators. <@ and @> are each other's commutators, but they were incorrectly marked as being each other's negators instead. (This was actually questioned in a comment in the original commit, but nobody followed through :-(.) Per bug #11178 from Christian Pronovost. In passing, fix some JSONB operator descriptions that were randomly different from the phrasing of every other similar description. catversion bump for pg_catalog contents change.	2014-08-16 12:53:54 -04:00
Robert Haas	b34e37bfef	Add sortsupport routines for text. This provides a small but worthwhile speedup when sorting text, at least in cases to which the sortsupport machinery applies. Robert Haas and Peter Geoghegan	2014-08-14 12:09:52 -04:00
Bruce Momjian	4c6780fd17	pg_upgrade: prevent oid conflicts with new-cluster TOAST tables Previously, TOAST tables only required in the new cluster could cause oid conflicts if they were auto-numbered and a later conflicting oid had to be assigned. Backpatch through 9.3	2014-08-07 14:56:13 -04:00
Andrew Dunstan	0f43a55331	json_build_object and json_build_array are stable, not immutable. These functions indirectly invoke output functions, so they can't be immutable. Backpatch to 9.4 where they were introduced. Catalog version bumped.	2014-07-15 14:24:47 -04:00
Andres Freund	a36a8fa376	Rename logical decoding's pg_llog directory to pg_logical. The old name wasn't very descriptive as of actual contents of the directory, which are historical snapshots in the snapshots/ subdirectory and mappingdata for rewritten tuples in mappings/. There's been a fair amount of discussion what would be a good name. I'm settling for pg_logical because it's likely that further data around logical decoding and replication will need saving in the future. Also add the missing entry for the directory into storage.sgml's list of PGDATA contents. Bumps catversion as the data directories won't be compatible.	2014-07-02 21:07:47 +02:00
Tom Lane	a749a23d7a	Remove use_json_as_text options from json_to_record/json_populate_record. The "false" case was really quite useless since all it did was to throw an error; a definition not helped in the least by making it the default. Instead let's just have the "true" case, which emits nested objects and arrays in JSON syntax. We might later want to provide the ability to emit sub-objects in Postgres record or array syntax, but we'd be best off to drive that off a check of the target field datatype, not a separate argument. For the functions newly added in 9.4, we can just remove the flag arguments outright. We can't do that for json_populate_record[set], which already existed in 9.3, but we can ignore the argument and always behave as if it were "true". It helps that the flag arguments were optional and not documented in any useful fashion anyway.	2014-06-29 13:50:58 -04:00
Tom Lane	f71136eeeb	Get rid of bogus separate pg_proc entries for json_extract_path operators. These should not have existed to begin with, but there was apparently some misunderstanding of the purpose of the opr_sanity regression test item that checks for operator implementation functions with their own comments. The idea there is to check for unintentional violations of the rule that operator implementation functions shouldn't be documented separately .... but for these functions, that is in fact what we want, since the variadic option is useful and not accessible via the operator syntax. Get rid of the extra pg_proc entries and fix the regression test and documentation to be explicit about what we're doing here.	2014-06-26 16:22:15 -07:00
Tom Lane	8b38a538c0	Add Asserts to verify that catalog cache keys are unique and not null. The catcache code is effectively assuming this already, so let's insist that the catalog and index are actually declared that way. Having done that, the comments in indexing.h about non-unique indexes not being used for catcaches are completely redundant not just mostly so; and we didn't have such a comment for every such index anyway. So let's get rid of them. Per discussion of whether we should identify primary keys for catalogs. We might or might not take that further step, but this change in itself will allow quicker detection of misdeclared catcaches, so it seems worth doing in any case.	2014-06-20 18:21:05 -04:00
Tom Lane	8f889b1083	Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ... This SQL-standard feature allows a sub-SELECT yielding multiple columns (but only one row) to be used to compute the new values of several columns to be updated. While the same results can be had with an independent sub-SELECT per column, such a workaround can require a great deal of duplicated computation. The standard actually says that the source for a multi-column assignment could be any row-valued expression. The implementation used here is tightly tied to our existing sub-SELECT support and can't handle other cases; the Bison grammar would have some issues with them too. However, I don't feel too bad about this since other cases can be converted into sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".	2014-06-18 13:22:34 -04:00
Heikki Linnakangas	0ef0b6784c	Change the signature of rm_desc so that it's passed a XLogRecord. Just feels more natural, and is more consistent with rm_redo.	2014-06-14 10:46:48 +03:00
Tom Lane	154146d208	Rename lo_create(oid, bytea) to lo_from_bytea(). The previous naming broke the query that libpq's lo_initialize() uses to collect the OIDs of the server-side functions it requires, because that query effectively assumes that there is only one function named lo_create in the pg_catalog schema (and likewise only one lo_open, etc). While we should certainly make libpq more robust about this, the naive query will remain in use in the field for the foreseeable future, so it seems the only workable choice is to use a different name for the new function. lo_from_bytea() won a small straw poll. Back-patch into 9.4 where the new function was introduced.	2014-06-12 15:39:09 -04:00
Tom Lane	5f93c37805	Add defenses against running with a wrong selection of LOBLKSIZE. It's critical that the backend's idea of LOBLKSIZE match the way data has actually been divided up in pg_largeobject. While we don't provide any direct way to adjust that value, doing so is a one-line source code change and various people have expressed interest recently in changing it. So, just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in pg_control and cross-check that the backend's compiled-in setting matches the on-disk data. Also tweak the code in inv_api.c so that fetches from pg_largeobject explicitly verify that the length of the data field is not more than LOBLKSIZE. Formerly we just had Asserts() for that, which is no protection at all in production builds. In some of the call sites an overlength data value would translate directly to a security-relevant stack clobber, so it seems worth one extra runtime comparison to be sure. In the back branches, we can't change the contents of pg_control; but we can still make the extra checks in inv_api.c, which will offer some amount of protection against running with the wrong value of LOBLKSIZE.	2014-06-05 11:31:06 -04:00
Andres Freund	f0c108560b	Consistently spell a replication slot's name as slot_name. Previously there's been a mix between 'slotname' and 'slot_name'. It's not nice to be unnecessarily inconsistent in a new feature. As a post beta1 initdb now is required in the wake of `eeca4cd35e`, fix the inconsistencies. Most of the changes won't affect usage of replication slots because the majority of changes is around function parameter names. The prominent exception to that is that the recovery.conf parameter 'primary_slotname' is now named 'primary_slot_name'.	2014-06-05 16:29:20 +02:00
Tom Lane	4c8ab1b91d	Add btree and hash opclasses for pg_lsn. This is needed to allow ORDER BY, DISTINCT, etc to work as expected for pg_lsn values. We had previously decided to put this off for 9.5, but in view of commit `eeca4cd35e` there's no reason to avoid a catversion bump for 9.4beta2, and this does make a pretty significant usability difference for pg_lsn. Michael Paquier, with fixes from Andres Freund and Tom Lane	2014-06-04 20:45:56 -04:00
Tom Lane	eeca4cd35e	Bump PG_CONTROL_VERSION for previous 9.4 changes. This should have been done in `6bc8ef0b7f` and/or `50e547096c`, but better late than never. If we don't change this then we risk 9.3 pg_controldata or pg_resetxlog being inappropriately used against a 9.4 pg_control file, or vice versa.	2014-06-04 18:16:17 -04:00
Tom Lane	ec3357a3bc	pg_lsn should not be marked typispreferred. In general it's not a good idea for built-in types in the 'U' category to be marked preferred; they could draw behavior away from user-defined types with similarly-named operators. pg_lsn is probably at low risk of that right now given the lack of casts between it and other types, but that doesn't make this marking OK. Ordinarily we'd bump catversion when changing any predefined catalog contents like this, but since we're past beta1, the costs of a forced initdb seem to outweigh the benefits of guaranteed behavioral consistency. There's not any known behavioral impact today anyway --- this is more in the nature of being sure there's not problems in future. Per an off-list complaint from Thomas Fanghaenel.	2014-05-28 00:26:46 -04:00
Tom Lane	12e611d43e	Rename jsonb_hash_ops to jsonb_path_ops. There's no longer much pressure to switch the default GIN opclass for jsonb, but there was still some unhappiness with the name "jsonb_hash_ops", since hashing is no longer a distinguishing property of that opclass, and anyway it seems like a relatively minor detail. At the suggestion of Heikki Linnakangas, we'll use "jsonb_path_ops" instead; that captures the important characteristic that each index entry depends on the entire path from the document root to the indexed value. Also add a user-facing explanation of the implementation properties of these two opclasses.	2014-05-11 12:06:04 -04:00
Tom Lane	bdf9dd4db7	Fix typcategory labeling of jsonb. Dunno who had the cute idea of labeling jsonb as typcategory 'C', but it is not a composite type. Label it 'U', since that's what json is using.	2014-05-09 09:25:58 -04:00
Heikki Linnakangas	d9daff0e0c	More jsonb cleanup. Fix JSONB_MAX_ELEMS and JSONB_MAX_PAIRS macros to use CB_MASK in the calculation. JENTRY_POSMASK happens to have the same value at the moment, but that's just coincidental. Refactor jsonb iterator functions, for readability. Get rid of the JENTRY_ISFIRST flag. Whenever we handle JEntrys, we have access to the whole array and have enough context information to know which entry is the first. This frees up one bit in the JEntry header for future use. While we're at it, shuffle the JEntry bits so that boolean true and false go together, for aesthetic reasons. Bump catalog version as this changes the on-disk format slightly.	2014-05-09 15:55:56 +03:00
Tom Lane	46dddf7673	Improve key representation for GIN jsonb_ops, and fix existence-search bug. Change the key representation so that values that would exceed 127 bytes are hashed into short strings, and so that the original JSON datatype of each value is recorded in the index. The hashing rule eliminates the major objection to having this opclass be the default for jsonb, namely that it could fail for plausible input data (due to GIN's restrictions on maximum key length). Preserving datatype information doesn't really buy us much right now, but it requires no extra space compared to the previous way, and it might be useful later. Also, change the consistency-checking functions to request recheck for exists (jsonb ? text) and related operators. The original analysis that this is an exactly checkable query was incorrect, since the index does not preserve information about whether a key appears at top level in the indexed JSON object. Add a test case demonstrating the problem. Make some other, mostly cosmetic improvements to the code in jsonb_gin.c as well. catversion bump due to on-disk data format change in jsonb_ops indexes.	2014-05-09 08:41:26 -04:00
Bruce Momjian	0a78320057	pgindent run for 9.4. This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not affect backpatching.	2014-05-06 12:12:18 -04:00
Tom Lane	3727afafee	Fix pg_type.typlen for newly-revived line type. Commit `261c7d4b65` removed the "m" field from struct LINE, but neglected to make pg_type.h's idea of the type's size match. This resulted in reading past the end of palloc'd LINE values when inserting them into tuples etc. In principle that could cause a SIGSEGV, though the odds of detectable problems seem low. Bump catversion since this makes an incompatible on-disk format change. Note that if the line type had been in use in the field, this would break pg_upgrade'ability of databases containing line values; but it seems unlikely that there are any (they'd have had to be compiled with -DENABLE_LINE_TYPE). Spotted by Andres Freund.	2014-05-05 13:37:54 -04:00
Tom Lane	2d00190495	Rationalize common/relpath.[hc]. Commit `a730183926` created rather a mess by putting dependencies on backend-only include files into include/common. We really shouldn't do that. To clean it up: * Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in catalog/catalog.h. We won't consider this symbol part of the FE/BE API. * Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider relpath.h as the source of truth for fork numbers, since relpath.c was already partially serving that function, and anyway relfilenode.h was kind of a random place for that enum. * So, relfilenode.h now includes relpath.h rather than vice-versa. This direction of dependency is fine. (That allows most, but not quite all, of the existing explicit #includes of relpath.h to go away again.) * Push forkname_to_number from catalog.c to relpath.c, just to centralize fork number stuff a bit better. * Push GetDatabasePath from catalog.c to relpath.c; it was rather odd that the previous commit didn't keep this together with relpath(). * To avoid needing relfilenode.h in common/, redefine the underlying function (now called GetRelationPath) as taking separate OID arguments, and make the APIs using RelFileNode or RelFileNodeBackend into macro wrappers. (The macros have a potential multiple-eval risk, but none of the existing call sites have an issue with that; one of them had such a risk already anyway.) * Fix failure to follow the directions when "init" fork type was added; specifically, the errhint in forkname_to_number wasn't updated, and neither was the SGML documentation for pg_relation_size(). * Fix tablespace-path-too-long check in CreateTableSpace() to account for fork-name component of maximum-length pathnames. This requires putting FORKNAMECHARS into a header file, but it was rather useless (and actually unreferenced) where it was. 
The last couple of items are potentially back-patchable bug fixes, if anyone is sufficiently excited about them; but personally I'm not. Per a gripe from Christoph Berg about how include/common wasn't self-contained.	2014-04-30 17:30:50 -04:00
Tom Lane	a0f9358149	Fix incorrect pg_proc.proallargtypes entries for two built-in functions. pg_sequence_parameters() and pg_identify_object() have had incorrect proallargtypes entries since 9.1 and 9.3 respectively. This was mostly masked by the correct information in proargtypes, but a few operations such as pg_get_function_arguments() (and thus psql's \df display) would show the wrong data types for these functions' input parameters. In HEAD, fix the wrong info, bump catversion, and add an opr_sanity regression test to catch future mistakes of this sort. In the back branches, just fix the wrong info so that installations initdb'd with future minor releases will have the right data. We can't force an initdb, and it doesn't seem like a good idea to add a regression test that will fail on existing installations. Andres Freund	2014-04-23 21:21:05 -04:00
Tom Lane	f0fedfe82c	Allow polymorphic aggregates to have non-polymorphic state data types. Before 9.4, such an aggregate couldn't be declared, because its final function would have to have polymorphic result type but no polymorphic argument, which CREATE FUNCTION would quite properly reject. The ordered-set-aggregate patch found a workaround: allow the final function to be declared as accepting additional dummy arguments that have types matching the aggregate's regular input arguments. However, we failed to notice that this problem applies just as much to regular aggregates, despite the fact that we had a built-in regular aggregate array_agg() that was known to be undeclarable in SQL because its final function had an illegal signature. So what we should have done, and what this patch does, is to decouple the extra-dummy-arguments behavior from ordered-set aggregates and make it generally available for all aggregate declarations. We have to put this into 9.4 rather than waiting till later because it slightly alters the rules for declaring ordered-set aggregates. The patch turned out a bit bigger than I'd hoped because it proved necessary to record the extra-arguments option in a new pg_aggregate column. I'd thought we could just look at the final function's pronargs at runtime, but that didn't work well for variadic final functions. It's probably just as well though, because it simplifies life for pg_dump to record the option explicitly. While at it, fix array_agg() to have a valid final-function signature, and add an opr_sanity test to notice future deviations from polymorphic consistency. I also marked the percentile_cont() aggregates as not needing extra arguments, since they don't.	2014-04-23 19:17:41 -04:00
Robert Haas	dfc0219f64	Add to_regprocedure() and to_regoperator(). These are natural complements to the functions added by commit `0886fc6a5c`, but they weren't included in the original patch for some reason. Add them. Patch by me, per a complaint by Tom Lane. Review by Tatsuo Ishii.	2014-04-16 12:21:43 -04:00
Tom Lane	d95425c8b9	Provide moving-aggregate support for boolean aggregates. David Rowley and Florian Pflug, reviewed by Dean Rasheed	2014-04-13 00:01:46 -04:00
Tom Lane	9d229f399e	Provide moving-aggregate support for a bunch of numerical aggregates. First installment of the promised moving-aggregate support in built-in aggregates: count(), sum(), avg(), stddev() and variance() for assorted datatypes, though not for float4/float8. In passing, remove a 2001-vintage kluge in interval_accum(): interval array elements have been properly aligned since around 2003, but nobody remembered to take out this workaround. Also, fix a thinko in the opr_sanity tests for moving-aggregate catalog entries. David Rowley and Florian Pflug, reviewed by Dean Rasheed	2014-04-12 20:33:09 -04:00
Tom Lane	a9d9acbf21	Create infrastructure for moving-aggregate optimization. Until now, when executing an aggregate function as a window function within a window with moving frame start (that is, any frame start mode except UNBOUNDED PRECEDING), we had to recalculate the aggregate from scratch each time the frame head moved. This patch allows an aggregate definition to include an alternate "moving aggregate" implementation that includes an inverse transition function for removing rows from the aggregate's running state. As long as this can be done successfully, runtime is proportional to the total number of input rows, rather than to the number of input rows times the average frame length. This commit includes the core infrastructure, documentation, and regression tests using user-defined aggregates. Follow-on commits will update some of the built-in aggregates to use this feature. David Rowley and Florian Pflug, reviewed by Dean Rasheed; additional hacking by me	2014-04-12 12:03:30 -04:00
Tom Lane	f23a5630eb	Add an in-core GiST index opclass for inet/cidr types. This operator class can accelerate subnet/supernet tests as well as btree-equivalent ordered comparisons. It also handles a new network operator inet && inet (overlaps, a/k/a "is supernet or subnet of"), which is expected to be useful in exclusion constraints. Ideally this opclass would be the default for GiST with inet/cidr data, but we can't mark it that way until we figure out how to do a more or less graceful transition from the current situation, in which the really-completely-bogus inet/cidr opclasses in contrib/btree_gist are marked as default. Having the opclass in core and not default is better than not having it at all, though. While at it, add new documentation sections to allow us to officially document GiST/GIN/SP-GiST opclasses, something there was never a clear place to do before. I filled these in with some simple tables listing the existing opclasses and the operators they support, but there's certainly scope to put more information there. Emre Hasegeli, reviewed by Andreas Karlsson, further hacking by me	2014-04-08 15:46:43 -04:00
Robert Haas	0886fc6a5c	Add new to_reg* functions for error-free OID lookups. These functions won't throw an error if the object doesn't exist, or if (for functions and operators) there's more than one matching object. Yugo Nagata and Nozomi Anzai, reviewed by Amit Khandekar, Marti Raudsepp, Amit Kapila, and me.	2014-04-08 10:27:56 -04:00
Simon Riggs	e5550d5fec	Reduce lock levels of some ALTER TABLE cmds: VALIDATE CONSTRAINT; CLUSTER ON; SET WITHOUT CLUSTER; ALTER COLUMN SET STATISTICS; ALTER COLUMN SET (); ALTER COLUMN RESET (). All other sub-commands use AccessExclusiveLock. Simon Riggs and Noah Misch. Reviews by Robert Haas and Andres Freund.	2014-04-06 11:13:43 -04:00
Robert Haas	59202fae04	Fix some compiler warnings that clang emits with -pedantic. Andres Freund	2014-04-04 11:29:50 -04:00
Tom Lane	c7b3539599	Fix non-equivalence of VARIADIC and non-VARIADIC function call formats. For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...) and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the former is converted to the latter at parse time. They have indeed been equivalent, in all releases before 9.3. However, commit `75b39e790` made an ill-considered decision to record which syntax had been used in FuncExpr nodes, and then to make equal() test that in checking node equality --- which caused the syntaxes to not be seen as equivalent by the planner. This is the underlying cause of bug #9817 from Dmitry Ryabov. It might seem that a quick fix would be to make equal() disregard FuncExpr.funcvariadic, but the same commit made that untenable, because the field actually is semantically significant for some VARIADIC ANY functions. This patch instead adopts the approach of redefining funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument is a variadic array, whether it got that way by parser intervention or was supplied explicitly by the user. Therefore the value will always be true for non-ANY variadic functions, restoring the principle of equivalence. (However, the planner will continue to consider use of VARIADIC as a meaningful difference for VARIADIC ANY functions, even though some such functions might disregard it.) In HEAD, this change lets us simplify the decompilation logic in ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether to print VARIADIC. However, in 9.3 we have to continue to cope with existing stored rules/views that might contain the previous definition. Fortunately, this just means no change in ruleutils.c, since its existing behavior effectively ignores funcvariadic for all cases other than VARIADIC ANY functions. 
In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic changed meanings; this is sort of pro forma, since I don't believe any built-in views are affected. Unfortunately, this patch doesn't magically fix everything for affected 9.3 users. After installing 9.3.5, they might need to recreate their rules/views/indexes containing variadic function calls in order to get everything consistent with the new definition. As in the cited bug, the symptom of a problem would be failure to use a nominally matching index that has a variadic function call in its definition. We'll need to mention this in the 9.3.5 release notes.	2014-04-03 22:02:24 -04:00
Andrew Dunstan	f9c6d72cbf	Cleanup around json_to_record/json_to_recordset. Set function parameter names and defaults. Add jsonb versions (which the code already provided for so the actual new code is trivial). Add jsonb regression tests and docs. Bump catalog version (which I apparently forgot to do when jsonb was committed).	2014-03-26 10:18:24 -04:00
Andrew Dunstan	d9134d0a35	Introduce jsonb, a structured format for storing json. The new format accepts exactly the same data as the json type. However, it is stored in a format that does not require reparsing the original text in order to process it, making it much more suitable for indexing and other operations. Insignificant whitespace is discarded, and the order of object keys is not preserved. Neither are duplicate object keys kept - the later value for a given key is the only one stored. The new type has all the functions and operators that the json type has, with the exception of the json generation functions (to_json, json_agg etc.) and with identical semantics. In addition, there are operator classes for hash and btree indexing, and two classes for GIN indexing, that have no equivalent in the json type. This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which was intended to provide similar facilities to a nested hstore type, but which in the end proved to have some significant compatibility issues. Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan. Review: Andres Freund	2014-03-23 16:40:19 -04:00
Heikki Linnakangas	c5608ea26a	Allow opclasses to provide tri-valued GIN consistent functions. With the GIN "fast scan" feature, GIN can skip items without fetching all the keys for them, if it can prove that they don't match regardless of those keys. So far, it has done the proving by calling the boolean consistent function with all combinations of TRUE/FALSE for the unfetched keys, but since that's O(2^n), it becomes infeasible with more than a few keys. We can avoid calling consistent with all the combinations, if we can tell the operator class implementation directly which keys are unknown. This commit includes a triConsistent function for the built-in array and tsvector opclasses. Alexander Korotkov, with some changes by me.	2014-03-12 17:51:30 +02:00
Alvaro Herrera	84df54b22e	Constructors for interval, timestamp, timestamptz. Author: Pavel Stěhule, editorialized somewhat by Álvaro Herrera. Reviewed-by: Tomáš Vondra, Marko Tiikkaja. With input from Fabrízio de Royes Mello, Jim Nasby.	2014-03-04 15:09:43 -03:00
Robert Haas	b89e151054	Introduce logical decoding. This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the affected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.	2014-03-03 16:32:18 -05:00
Robert Haas	a8e9b86b5e	Bump catversion. The previous patch should have entailed a catversion bump, but I forgot.	2014-03-03 07:22:20 -05:00
Robert Haas	d83ee62231	Corrections to replication slots code and documentation. Andres Freund, per a report from Vik Fearing	2014-03-03 07:16:54 -05:00
Robert Haas	ae95f5f74a	Define LSNOID in pg_type.h. Most other built-in types have a similarly-named constant, so this type should probably have one, too. Michael Paquier	2014-03-03 07:03:41 -05:00
Robert Haas	dd1a3bccca	Show xid and xmin in pg_stat_activity and pg_stat_replication. Christian Kruse, reviewed by Andres Freund and myself, with further minor adjustments by me.	2014-02-25 12:34:04 -05:00
Robert Haas	6615e77439	Use pg_lsn data type in pg_stat_replication, too. Michael Paquier, per a suggestion from Andres Freund	2014-02-24 10:38:45 -05:00
Robert Haas	6f289c2b7d	Switch various builtin functions to use pg_lsn instead of text. The functions in slotfuncs.c don't exist in any released version, but the changes to xlogfuncs.c represent backward-incompatibilities. Per discussion, we're hoping that the queries using these functions are few enough and simple enough that this won't cause too much breakage for users. Michael Paquier, reviewed by Andres Freund and further modified by me.	2014-02-19 11:37:43 -05:00
Robert Haas	694e3d139a	Further code review for pg_lsn data type. Change input function error messages to be more consistent with what is done elsewhere. Remove a bunch of redundant type casts, so that the compiler will warn us if we screw up. Don't pass LSNs by value on platforms where a Datum is only 32 bits, per buildfarm. Move macros for packing and unpacking LSNs to pg_lsn.h so that we can include access/xlogdefs.h, to avoid an unsatisfied dependency on XLogRecPtr.	2014-02-19 10:06:59 -05:00
Robert Haas	7d03a83f4d	Add a pg_lsn data type, to represent an LSN. Robert Haas and Michael Paquier	2014-02-19 08:35:23 -05:00
Robert Haas	5f173040e3	Avoid repeated name lookups during table and index DDL. If the name lookups come to different conclusions due to concurrent activity, we might perform some parts of the DDL on a different table than other parts. At least in the case of CREATE INDEX, this can be used to cause the permissions checks to be performed against a different table than the index creation, allowing for a privilege escalation attack. This changes the calling convention for DefineIndex, CreateTrigger, transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible (in 9.2 and newer), and AlterTable (in 9.1 and older). In addition, CheckRelationOwnership is removed in 9.2 and newer and the calling convention is changed in older branches. A field has also been added to the Constraint node (FkConstraint in 8.4). Third-party code calling these functions or using the Constraint node will require updating. Report by Andres Freund. Patch by Robert Haas and Andres Freund, reviewed by Tom Lane. Security: CVE-2014-0062	2014-02-17 09:33:31 -05:00
Robert Haas	80353f3528	Adjust pg_sleep_for/pg_sleep_until to use clock_timestamp. Otherwise, pg_sleep_until does the wrong thing in a multi-statement transaction. Julien Rouhaud	2014-02-03 14:33:43 -05:00
Robert Haas	858ec11858	Introduce replication slots. Replication slots are a crash-safe data structure which can be created on either a master or a standby to prevent premature removal of write-ahead log segments needed by a standby, as well as (with hot_standby_feedback=on) pruning of tuples whose removal would cause replication conflicts. Slots have some advantages over existing techniques, as explained in the documentation. In a few places, we refer to the type of replication slots introduced by this patch as "physical" slots, because forthcoming patches for logical decoding will also have slots, but with somewhat different properties. Andres Freund and Robert Haas	2014-01-31 22:45:36 -05:00
Bruce Momjian	fc4ffba968	system catalogs: reorder pg_amproc entries into proper sections. Report from Antonin Houska.	2014-01-31 16:04:18 -05:00
Robert Haas	760c770ff6	Add convenience functions pg_sleep_for and pg_sleep_until. Vik Fearing, reviewed by Pavel Stehule and myself	2014-01-30 15:47:56 -05:00
Andrew Dunstan	5e52e9d6d4	Forgot to bump catalog version for json_array_elements_text.	2014-01-29 16:38:31 -05:00
Andrew Dunstan	5264d91541	Add json_array_elements_text function. This was a notable omission from the json functions added in 9.3 and there have been numerous complaints about its absence. Laurence Rowe.	2014-01-29 15:39:01 -05:00
Andrew Dunstan	105639900b	New json functions. json_build_array() and json_build_object allow for the construction of arbitrarily complex json trees. json_object() turns a one or two dimensional array, or two separate arrays, into a json_object of name/value pairs, similarly to the hstore() function. json_object_agg() aggregates its two arguments into a single json object as name value pairs. Catalog version bumped. Andrew Dunstan, reviewed by Marko Tiikkaja.	2014-01-28 17:48:21 -05:00
Fujii Masao	9132b189bf	Add pg_stat_archiver statistics view. This view shows the statistics about the WAL archiver process's activity. Gabriele Bartolini, reviewed by Michael Paquier, refactored a bit by me.	2014-01-29 02:58:22 +09:00
Alvaro Herrera	b152c6cd0d	Make DROP IF EXISTS more consistently not fail. Some cases were still reporting errors and aborting, instead of a NOTICE that the object was being skipped. This makes it more difficult to cleanly handle pg_dump --clean, so change that to instead skip missing objects properly. Per bug #7873 reported by Dave Rolsky; apparently this affects a large number of users. Authors: Pavel Stehule and Dean Rasheed. Some tweaks by Álvaro Herrera.	2014-01-23 14:40:29 -03:00
Robert Haas	01f7808b3e	Add a cardinality function for arrays. Unlike our other array functions, this considers the total number of elements across all dimensions, and returns 0 rather than NULL when the array has no elements. But it seems that both of those behaviors are almost universally disliked, so hopefully that's OK. Marko Tiikkaja, reviewed by Dean Rasheed and Pavel Stehule	2014-01-21 12:38:53 -05:00
Andrew Dunstan	11829ff8b2	Remove DESCR entries for json operator functions. Per -hackers discussion.	2014-01-10 22:25:04 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014. Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Tom Lane	8b49a6044d	Cache catalog lookup data across groups in ordered-set aggregates. The initial commit of ordered-set aggregates just did all the setup work afresh each time the aggregate function is started up. But in a GROUP BY query, the catalog lookups need not be repeated for each group, since the column datatypes and sort information won't change. When there are many small groups, this makes for a useful, though not huge, performance improvement. Per suggestion from Andrew Gierth. Profiling of these cases suggests that it might be profitable to avoid duplicate lookups within tuplesort startup as well; but changing the tuplesort APIs would have much broader impact, so I left that for another day.	2014-01-05 12:28:39 -05:00
Peter Eisentraut	a09e3fd776	Fix whitespace	2013-12-26 23:51:56 -05:00
Tom Lane	8d65da1f01	Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane	2013-12-23 16:11:35 -05:00
Fujii Masao	961bf59fb7	Rename wal_log_hintbits to wal_log_hints, per discussion on pgsql-hackers. Sawada Masahiko	2013-12-21 03:33:16 +09:00
Bruce Momjian	527fdd9df1	Move pg_upgrade_support global variables to their own include file. Previously their declarations were spread around to avoid accidental access.	2013-12-19 16:10:07 -05:00
Heikki Linnakangas	50e547096c	Add GUC to enable WAL-logging of hint bits, even with checksums disabled. WAL records of hint bit updates are useful to tools that want to examine which pages have been modified. In particular, this is required to make the pg_rewind tool safe (without checksums). This can also be used to test how much extra WAL-logging would occur if you enabled checksums, without actually enabling them (which you can't currently do without re-initdb'ing). Sawada Masahiko, docs by Samrat Revagade. Reviewed by Dilip Kumar, with further changes by me.	2013-12-13 16:26:14 +02:00
Robert Haas	8e18d04d4d	Refine our definition of what constitutes a system relation. Although user-defined relations can't be directly created in pg_catalog, it's possible for them to end up there, because you can create them in some other schema and then use ALTER TABLE .. SET SCHEMA to move them there. Previously, such relations couldn't afterwards be manipulated, because IsSystemRelation()/IsSystemClass() rejected all attempts to modify objects in the pg_catalog schema, regardless of their origin. With this patch, they now reject only those objects in pg_catalog which were created at initdb-time, allowing most operations on user-created tables in pg_catalog to proceed normally. This patch also adds new functions IsCatalogRelation() and IsCatalogClass(), which are similar to IsSystemRelation() and IsSystemClass() but with a slightly narrower definition: only TOAST tables of system catalogs are included, rather than all TOAST tables. This is currently used only for making decisions about when invalidation messages need to be sent, but upcoming logical decoding patches will find other uses for this information. Andres Freund, with some modifications by me.	2013-11-28 20:57:20 -05:00
Peter Eisentraut	85ed91ee7d	Implement information_schema.parameters.parameter_default column. Reviewed-by: Ali Dar <ali.munir.dar@gmail.com>; Reviewed-by: Amit Khandekar <amit.khandekar@enterprisedb.com>; Reviewed-by: Rodolfo Campero <rodolfo.campero@anachronics.com>	2013-11-26 23:21:35 -05:00
Tom Lane	45e02e3232	Fix array slicing of int2vector and oidvector values. The previous coding labeled expressions such as pg_index.indkey[1:3] as being of int2vector type; which is not right because the subscript bounds of such a result don't, in general, satisfy the restrictions of int2vector. To fix, implicitly promote the result of slicing int2vector to int2[], or oidvector to oid[]. This is similar to what we've done with domains over arrays, which is a good analogy because these types are very much like restricted domains of the corresponding regular-array types. A side-effect is that we now also forbid array-element updates on such columns, eg while "update pg_index set indkey[4] = 42" would have worked before if you were superuser (and corrupted your catalogs irretrievably, no doubt) it's now disallowed. This seems like a good thing since, again, some choices of subscripting would've led to results not satisfying the restrictions of int2vector. The case of an array-slice update was rejected before, though with a different error message than you get now. We could make these cases work in future if we added a cast from int2[] to int2vector (with a cast function checking the subscript restrictions) but it seems unlikely that there's any value in that. Per report from Ronan Dunklau. Back-patch to all supported branches because of the crash risks involved.	2013-11-23 20:03:56 -05:00
Tom Lane	784e762e88	Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. This patch adds the ability to write TABLE( function1(), function2(), ...) as a single FROM-clause entry. The result is the concatenation of the first row from each function, followed by the second row from each function, etc; with NULLs inserted if any function produces fewer rows than others. This is believed to be a much more useful behavior than what Postgres currently does with multiple SRFs in a SELECT list. This syntax also provides a reasonable way to combine use of column definition lists with WITH ORDINALITY: put the column definition list inside TABLE(), where it's clear that it doesn't control the ordinality column as well. Also implement SQL-compliant multiple-argument UNNEST(), by turning UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)). The SQL standard specifies TABLE() with only a single function, not multiple functions, and it seems to require an implicit UNNEST() which is not what this patch does. There may be something wrong with that reading of the spec, though, because if it's right then the spec's TABLE() is just a pointless alternative spelling of UNNEST(). After further review of that, we might choose to adopt a different syntax for what this patch does, but in any case this functionality seems clearly worthwhile. Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and significantly revised by me	2013-11-21 19:37:20 -05:00
Tom Lane	f901bb50e3	Add make_date() and make_time() functions. Pavel Stehule, reviewed by Jeevan Chalke and Atri Sharma	2013-11-17 15:06:50 -05:00
Tom Lane	69c8fbac20	Improve performance of numeric sum(), avg(), stddev(), variance(), etc. This patch improves performance of most built-in aggregates that formerly used a NUMERIC or NUMERIC array as their transition type; this includes not only aggregates on numeric inputs, but some aggregates on integer inputs where overflow of an int8 value is a possibility. The code now uses a special-purpose data structure to avoid array construction and deconstruction overhead, as well as packing and unpacking overhead for numeric values. These aggregates' transition type is now declared as INTERNAL, since it doesn't correspond to any SQL data type. To keep the planner from thinking that that means a lot of storage will be used, we make use of the just-added pg_aggregate.aggtransspace feature. The space estimate is set to 128 bytes, which is at least in the right ballpark. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 18:46:34 -05:00
Tom Lane	6cb86143e8	Allow aggregates to provide estimates of their transition state data size. Formerly the planner had a hard-wired rule of thumb for guessing the amount of space consumed by an aggregate function's transition state data. This estimate is critical to deciding whether it's OK to use hash aggregation, and in many situations the built-in estimate isn't very good. This patch adds a column to pg_aggregate wherein a per-aggregate estimate can be provided, overriding the planner's default, and infrastructure for setting the column via CREATE AGGREGATE. It may be that additional smarts will be required in future, perhaps even a per-aggregate estimation function. But this is already a step forward. This is extracted from a larger patch to improve the performance of numeric and int8 aggregates. I (tgl) thought it was worth reviewing and committing this infrastructure separately. In this commit, all built-in aggregates are given aggtransspace = 0, so no behavior should change. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 16:03:40 -05:00
Robert Haas	07cacba983	Add the notion of REPLICA IDENTITY for a table. Pending patches for logical replication will use this to determine which columns of a tuple ought to be considered as its candidate key. Andres Freund, with minor, mostly cosmetic adjustments by me	2013-11-08 12:30:43 -05:00
Noah Misch	c50b7c09d8	Add large object functions catering to SQL callers. With these, one need no longer manipulate large object descriptors and extract numeric constants from header files in order to read and write large object contents from SQL. Pavel Stehule, reviewed by Rushabh Lathia.	2013-10-27 22:56:54 -04:00
Peter Eisentraut	3dc543b3d8	Replace duplicate_oids with Perl implementation. It is more portable, more robust, and more readable. From: Andrew Dunstan <andrew@dunslane.net>	2013-10-10 20:09:42 -04:00
Andrew Dunstan	4d212bac17	json_typeof function. Andrew Tipton.	2013-10-10 12:21:59 -04:00
Peter Eisentraut	261c7d4b65	Revive line type. Change the input/output format to {A,B,C}, to match the internal representation. Complete the implementations of line_in, line_out, line_recv, line_send. Remove comments and error messages about the line type not being implemented. Add regression tests for existing line operators and functions. Reviewed-by: rui hua <365507506hua@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com>	2013-10-09 22:34:38 -04:00
Kevin Grittner	f566515192	Add record_image_ops opclass for matview concurrent refresh. REFRESH MATERIALIZED VIEW CONCURRENTLY was broken for any matview containing a column of a type without a default btree operator class. It also did not produce results consistent with a non-concurrent REFRESH or a normal view if any column was of a type which allowed user-visible differences between values which compared as equal according to the type's default btree opclass. Concurrent matview refresh was modified to use the new operators to solve these problems. Documentation was added for record comparison, both for the default btree operator class for record, and the newly added operators. Regression tests now check for proper behavior both for a matview with a box column and a matview containing a citext column. Reviewed by Steve Singer, who suggested some of the doc language.	2013-10-09 14:26:09 -05:00
Kevin Grittner	277607d600	Eliminate pg_rewrite.ev_attr column and related dead code. Commit `95ef6a3448` removed the ability to create rules on an individual column as of 7.3, but left some residual code which has since been useless. This cleans up that dead code without any change in behavior other than dropping the useless column from the catalog.	2013-09-05 14:03:43 -05:00
Tom Lane	0d3f4406df	Allow aggregate functions to be VARIADIC. There's no inherent reason why an aggregate function can't be variadic (even VARIADIC ANY) if its transition function can handle the case. Indeed, this patch to add the feature touches none of the planner or executor, and little of the parser; the main missing stuff was DDL and pg_dump support. It is true that variadic aggregates can create the same sort of ambiguity about parameters versus ORDER BY keys that was complained of when we (briefly) had both one- and two-argument forms of string_agg(). However, the policy formed in response to that discussion only said that we'd not create any built-in aggregates with varying numbers of arguments, not that we shouldn't allow users to do it. So the logical extension of that is we can allow users to make variadic aggregates as long as we're wary about shipping any such in core. In passing, this patch allows aggregate function arguments to be named, to the extent of remembering the names in pg_proc and dumping them in pg_dump. You can't yet call an aggregate using named-parameter notation. That seems like a likely future extension, but it'll take some work, and it's not what this patch is really about. Likewise, there's still some work needed to make window functions handle VARIADIC fully, but I left that for another day. initdb forced because of new aggvariadic field in Aggref parse nodes.	2013-09-03 17:08:46 -04:00
Robert Haas	f01d1ae3a1	Add infrastructure for mapping relfilenodes to relation OIDs. Future patches are expected to introduce logical replication that works by decoding WAL. WAL contains relfilenodes rather than relation OIDs, so this infrastructure will be needed to find the relation OID based on WAL contents. If logical replication does not make it into this release, we probably should consider reverting this, since it will add some overhead to DDL operations that create new relations. One additional index insert per pg_class row is not a large overhead, but it's more than zero. Another way of meeting the needs of logical replication would be to add the relation OID to WAL, but that would burden DML operations, not only DDL. Andres Freund, with some changes by me. Design review, in earlier versions, by Álvaro Herrera.	2013-07-22 11:09:10 -04:00
Stephen Frost	4cbe3ac3e8	WITH CHECK OPTION support for auto-updatable VIEWs. For simple views which are automatically updatable, this patch allows the user to specify what level of checking should be done on records being inserted or updated. For 'LOCAL CHECK', new tuples are validated against the conditionals of the view they are being inserted into, while for 'CASCADED CHECK' the new tuples are validated against the conditionals for all views involved (from the top down). This option is part of the SQL specification. Dean Rasheed, reviewed by Pavel Stehule	2013-07-18 17:10:16 -04:00
Noah Misch	b560ec1b0d	Implement the FILTER clause for aggregate function calls. This is SQL-standard with a few extensions, namely support for subqueries and outer references in clause expressions. catversion bump due to change in Aggref and WindowFunc. David Fetter, reviewed by Dean Rasheed.	2013-07-16 20:15:36 -04:00
Magnus Hagander	c87ff71f37	Expose the estimation of number of changed tuples since last analyze. This value, now pg_stat_all_tables.n_mod_since_analyze, was already tracked and used by autovacuum, but not exposed to the user. Mark Kirkwood, review by Laurenz Albe	2013-07-05 15:10:15 +02:00
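
The row-combining rule described in commit 784e762e88 (first row from each function, then second row from each, with NULLs where a function runs out of rows) can be illustrated outside the database. The following is only a rough Python sketch of that rule, not PostgreSQL's implementation: the helper name `combine_row_sources` is hypothetical, each "function" is assumed to return a single column, and Python's `None` stands in for SQL NULL.

```python
from itertools import zip_longest


def combine_row_sources(*sources):
    """Pair up the i-th value from every source, padding short sources.

    Hypothetical helper: mimics the behavior the commit message describes
    for TABLE(f1(), f2(), ...), with None standing in for SQL NULL.
    """
    # zip_longest keeps going until the longest source is exhausted,
    # filling the gaps left by shorter sources with fillvalue.
    return list(zip_longest(*sources, fillvalue=None))


# Two stand-in "set-returning functions" of different lengths.
rows = combine_row_sources([1, 2, 3], ["a", "b"])
for row in rows:
    print(row)
```

Under the same assumptions, the multi-argument UNNEST(a, b, c) mentioned in that commit corresponds to calling the helper with the three arrays as its sources.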

2110 Commits