postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	8b6d6cf853	Remove objname/objargs split for referring to objects In simpler times, it might have worked to refer to all kinds of objects by a list of name components and an optional argument list. But this doesn't work for all objects, which has resulted in a collection of hacks to place various other nodes types into these fields, which have to be unpacked at the other end. This makes it also weird to represent lists of such things in the grammar, because they would have to be lists of singleton lists, to make the unpacking work consistently. The other problem is that keeping separate name and args fields makes it awkward to deal with lists of functions. Change that by dropping the objargs field and have objname, renamed to object, be a generic Node, which can then be flexibly assigned and managed using the normal Node mechanisms. In many cases it will still be a List of names, in some cases it will be a string Value, for types it will be the existing Typename, for functions it will now use the existing ObjectWithArgs node type. Some of the more obscure object types still use somewhat arbitrary nested lists. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-03-06 13:31:47 -05:00
Peter Eisentraut	272adf4f9c	Disallow CREATE/DROP SUBSCRIPTION in transaction block Disallow CREATE SUBSCRIPTION and DROP SUBSCRIPTION in a transaction block when the replication slot is to be created or dropped, since that cannot be rolled back. based on patch by Masahiko Sawada <sawada.mshk@gmail.com>	2017-03-03 23:29:13 -05:00
Peter Eisentraut	6d16ecc646	Add CREATE COLLATION IF NOT EXISTS clause The core of the functionality was already implemented when pg_import_system_collations was added. This just exposes it as an option in the SQL command.	2017-02-15 10:01:28 -05:00
Robert Haas	a507b86900	Add WAL consistency checking facility. When the new GUC wal_consistency_checking is set to a non-empty value, it triggers recording of additional full-page images, which are compared on the standby against the results of applying the WAL record (without regard to those full-page images). Allowable differences such as hints are masked out, and the resulting pages are compared; any difference results in a FATAL error on the standby. Kuntal Ghosh, based on earlier patches by Michael Paquier and Heikki Linnakangas. Extensively reviewed and revised by Michael Paquier and by me, with additional reviews and comments from Amit Kapila, Álvaro Herrera, Simon Riggs, and Peter Eisentraut.	2017-02-08 15:45:30 -05:00
Heikki Linnakangas	dbd69118c0	Replace isMD5() with a more future-proof way to check if pw is encrypted. The rule is that if pg_authid.rolpassword begins with "md5" and has the right length, it's an MD5 hash, otherwise it's a plaintext password. The idiom has been to use isMD5() to check for that, but that gets awkward, when we add new kinds of verifiers, like the verifiers for SCRAM authentication in the pending SCRAM patch set. Replace isMD5() with a new get_password_type() function, so that when new verifier types are added, we don't need to remember to modify every place that currently calls isMD5(), to also recognize the new kinds of verifiers. Also, use the new plain_crypt_verify function in passwordcheck, so that it doesn't need to know about MD5, or in the future, about other kinds of hashes or password verifiers. Reviewed by Michael Paquier and Peter Eisentraut. Discussion: https://www.postgresql.org/message-id/2d07165c-1793-e243-a2a9-e45b624c7580@iki.fi	2017-02-01 13:11:37 +02:00
Peter Eisentraut	3d9e73ea5f	Update copyright years in some recently added files	2017-01-25 12:32:05 -05:00
Robert Haas	27cdb3414b	Reindent table partitioning code. We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.	2017-01-24 10:20:02 -05:00
Peter Eisentraut	b480086760	Add more includes so header files are self-contained	2017-01-21 15:49:53 -05:00
Peter Eisentraut	665d1fad99	Logical replication - Add PUBLICATION catalogs and DDL - Add SUBSCRIPTION catalog and DDL - Define logical replication protocol and output plugin - Add logical replication workers From: Petr Jelinek <petr@2ndquadrant.com> Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>	2017-01-20 09:04:49 -05:00
Peter Eisentraut	352a24a1f9	Generate fmgr prototypes automatically Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains prototypes for all functions registered in pg_proc.h. This avoids having to manually maintain these prototypes across a random variety of header files. It also automatically enforces a correct function signature, and since there are warnings about missing prototypes, it will detect functions that are defined but not registered in pg_proc.h (or otherwise used). Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 14:06:07 -05:00
Tom Lane	ab1f0c8225	Change representation of statement lists, and add statement location info. This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre	2017-01-14 16:02:35 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Peter Eisentraut	1753b1b027	Add pg_sequence system catalog Move sequence metadata (start, increment, etc.) into a proper system catalog instead of storing it in the sequence heap object. This separates the metadata from the sequence data. Sequence metadata is now operated on transactionally by DDL commands, whereas previously rollbacks of sequence-related DDL commands would be ignored. Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-12-20 08:28:18 -05:00
Robert Haas	f0e44751d7	Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.	2016-12-07 13:17:55 -05:00
Tom Lane	b3427dade1	Delete deleteWhatDependsOn() in favor of more performDeletion() flag bits. deleteWhatDependsOn() had grown an uncomfortably large number of assumptions about what it's used for. There are actually only two minor differences between what it does and what a regular performDeletion() call can do, so let's invent additional bits in performDeletion's existing flags argument that specify those behaviors, and get rid of deleteWhatDependsOn() as such. (We'd probably have done it this way from the start, except that performDeletion didn't originally have a flags argument, IIRC.) Also, add a SKIP_EXTENSIONS flag bit that prevents ever recursing to an extension, and use that when dropping temporary objects at session end. This provides a more general solution to the problem addressed in a hacky way in commit 08dd23cec: if an extension script creates temp objects and forgets to remove them again, the whole extension went away when its contained temp objects were deleted. The previous solution only covered temp relations, but this solves it for all object types. These changes require minor additions in dependency.c to pass the flags to subroutines that previously didn't get them, but it's still a net savings of code, and it seems cleaner than before. Having done this, revert the special-case code added in `08dd23cec` that prevented addition of pg_depend records for temp table extension membership, because that caused its own oddities: dropping an extension that had created such a table didn't automatically remove the table, leading to a failure if the table had another dependency on the extension (such as use of an extension data type), or to a duplicate-name failure if you then tried to recreate the extension. But we keep the part that prevents the pg_temp_nnn schema from becoming an extension member; we never want that to happen. Add a regression test case covering these behaviors. Although this fixes some arguable bugs, we've heard few field complaints, and any such problems are easily worked around by explicitly dropping temp objects at the end of extension scripts (which seems like good practice anyway). So I won't risk a back-patch. Discussion: https://postgr.es/m/e51f4311-f483-4dd0-1ccc-abec3c405110@BlueTreble.com	2016-12-02 14:57:55 -05:00
Peter Eisentraut	67dc4ccbb2	Add pg_sequences view Like pg_tables, pg_views, and others, this view contains information about sequences in a way that is independent of the system catalog layout but more comprehensive than the information schema. To help implement the view, add a new internal function pg_sequence_last_value() to return the last value of a sequence. This is kept separate from pg_sequence_parameters() to separate querying run-time state from catalog-like information. Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-11-18 14:59:03 -05:00
Kevin Grittner	8c48375e5f	Implement syntax for transition tables in AFTER triggers. This is infrastructure for the complete SQL standard feature. No support is included at this point for execution nodes or PLs. The intent is to add that soon. As this patch leaves things, standard syntax can create tuplestores to contain old and/or new versions of rows affected by a statement. References to these tuplestores are in the TriggerData structure. C triggers can access the tuplestores directly, so they are usable, but they cannot yet be referenced within a SQL statement.	2016-11-04 10:49:50 -05:00
Heikki Linnakangas	babe05bc2b	Turn password_encryption GUC into an enum. This makes the parameter easier to extend, to support other password-based authentication protocols than MD5. (SCRAM is being worked on.) The GUC still accepts on/off as aliases for "md5" and "plain", although we may want to remove those once we actually add support for another password hash type. Michael Paquier, reviewed by David Steele, with some further edits by me. Discussion: <CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com>	2016-09-28 12:22:44 +03:00
Peter Eisentraut	49eb0fd097	Add location field to DefElem Add a location field to the DefElem struct, used to parse many utility commands. Update various error messages to supply error position information. To propogate the error position information in a more systematic way, create a ParseState in standard_ProcessUtility() and pass that to interested functions implementing the utility commands. This seems better than passing the query string and then reassembling a parse state ad hoc, which violates the encapsulation of the ParseState type. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2016-09-06 12:00:00 -04:00
Tom Lane	15bc038f9b	Relax transactional restrictions on ALTER TYPE ... ADD VALUE. To prevent possibly breaking indexes on enum columns, we must keep uncommitted enum values from getting stored in tables, unless we can be sure that any such column is new in the current transaction. Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE from being executed at all in a transaction block, unless the target enum type had been created in the current transaction. This patch removes that restriction, and instead insists that an uncommitted enum value can't be referenced unless it belongs to an enum type created in the same transaction as the value. Per discussion, this should be a bit less onerous. It does require each function that could possibly return a new enum value to SQL operations to check this restriction, but there aren't so many of those that this seems unmaintainable. Andrew Dunstan and Tom Lane Discussion: <4075.1459088427@sss.pgh.pa.us>	2016-09-05 12:59:55 -04:00
Tom Lane	b5bce6c1ec	Final pgindent + perltidy run for 9.6.	2016-08-15 13:42:51 -04:00
Tom Lane	4d042999f9	Print a given subplan only once in EXPLAIN. We have, for a very long time, allowed the same subplan (same member of the PlannedStmt.subplans list) to be referenced by more than one SubPlan node; this avoids problems for cases such as subplans within an IndexScan's indxqual and indxqualorig fields. However, EXPLAIN had not gotten the memo and would print each reference as though it were an independent identical subplan. To fix, track plan_ids of subplans we've printed and don't print the same plan_id twice. Per report from Pavel Stehule. BTW: the particular case of IndexScan didn't cause visible duplication in a plain EXPLAIN, only EXPLAIN ANALYZE, because in the former case we short-circuit executor startup before the indxqual field is processed by ExecInitExpr. That seems like it could easily lead to other EXPLAIN problems in future, but it's not clear how to avoid it without breaking the "EXPLAIN a plan using hypothetical indexes" use-case. For now I've left that issue alone. Although this is a longstanding bug, it's purely cosmetic (no great harm is done by the repeat printout) and we haven't had field complaints before. So I'm hesitant to back-patch it, especially since there is some small risk of ABI problems due to the need to add a new field to ExplainState. In passing, rearrange order of fields in ExplainState to be less random, and update some obsolete comments about when/where to initialize them. Report: <CAFj8pRAimq+NK-menjt+3J4-LFoodDD8Or6=Lc_stcFD+eD4DA@mail.gmail.com>	2016-07-11 18:14:29 -04:00
Robert Haas	10c0558ffe	Fix several mistakes around parallel workers and client_encoding. Previously, workers sent data to the leader using the client encoding. That mostly worked, but the leader the converted the data back to the server encoding. Since not all encoding conversions are reversible, that could provoke failures. Fix by using the database encoding for all communication between worker and leader. Also, while temporary changes to GUC settings, as from the SET clause of a function, are in general OK for parallel query, changing client_encoding this way inside of a parallel worker is not OK. Previously, that would have confused the leader; with these changes, it would not confuse the leader, but it wouldn't do anything either. So refuse such changes in parallel workers. Also, the previous code naively assumed that when it received a NotifyResonse from the worker, it could pass that directly back to the user. But now that worker-to-leader communication always uses the database encoding, that's clearly no longer correct - though, actually, the old way was always broken for V2 clients. So disassemble and reconstitute the message instead. Issues reported by Peter Eisentraut. Patch by me, reviewed by Peter Eisentraut.	2016-06-30 18:35:32 -04:00
Tom Lane	8ebb69f854	Fix some infelicities in EXPLAIN output for parallel query plans. In non-text output formats, parallelized aggregates were reporting "Partial" or "Finalize" as a field named "Operation", which might be all right in the absence of any context --- but other plan node types use that field to report SQL-visible semantics, such as Select/Insert/Update/Delete. So that naming choice didn't seem good to me. I changed it to "Partial Mode". Also, the field did not appear at all for a non-parallelized Agg plan node, which is contrary to expectation in non-text formats. We're notionally producing objects that conform to a schema, so the set of fields for a given node type and EXPLAIN mode should be well-defined. I set it up to fill in "Simple" in such cases. Other fields that were added for parallel query, namely "Parallel Aware" and Gather's "Single Copy", had not gotten the word on that point either. Make them appear always in non-text output. Also, the latter two fields were nominally producing boolean output, but were getting it wrong, because bool values shouldn't be quoted in JSON or YAML. Somehow we'd not needed an ExplainPropertyBool formatting subroutine before 9.6; but now we do, so invent it. Discussion: <16002.1466972724@sss.pgh.pa.us>	2016-06-29 18:51:20 -04:00
Alvaro Herrera	f2fcad27d5	Support ALTER THING .. DEPENDS ON EXTENSION This introduces a new dependency type which marks an object as depending on an extension, such that if the extension is dropped, the object automatically goes away; and also, if the database is dumped, the object is included in the dump output. Currently the grammar supports this for indexes, triggers, materialized views and functions only, although the utility code is generic so adding support for more object types is a matter of touching the parser rules only. Author: Abhijit Menon-Sen Reviewed-by: Alexander Korotkov, Álvaro Herrera Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org	2016-04-05 18:38:54 -03:00
Alvaro Herrera	473b932870	Support CREATE ACCESS METHOD This enables external code to create access methods. This is useful so that extensions can add their own access methods which can be formally tracked for dependencies, so that DROP operates correctly. Also, having explicit support makes pg_dump work correctly. Currently only index AMs are supported, but we expect different types to be added in the future. Authors: Alexander Korotkov, Petr Jelínek Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby Commitfest-URL: https://commitfest.postgresql.org/9/353/ Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com	2016-03-23 23:01:35 -03:00
Robert Haas	c16dc1aca5	Add simple VACUUM progress reporting. There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing. Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.	2016-03-15 13:32:56 -04:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Alvaro Herrera	756e7b4c9d	Rework internals of changing a type's ownership This is necessary so that REASSIGN OWNED does the right thing with composite types, to wit, that it also alters ownership of the type's pg_class entry -- previously, the pg_class entry remained owned by the original user, which caused later other failures such as the new owner's inability to use ALTER TYPE to rename an attribute of the affected composite. Also, if the original owner is later dropped, the pg_class entry becomes owned by a non-existant user which is bogus. To fix, create a new routine AlterTypeOwner_oid which knows whether to pass the request to ATExecChangeOwner or deal with it directly, and use that in shdepReassignOwner rather than calling AlterTypeOwnerInternal directly. AlterTypeOwnerInternal is now simpler in that it only modifies the pg_type entry and recurses to handle a possible array type; higher-level tasks are handled by either AlterTypeOwner directly or AlterTypeOwner_oid. I took the opportunity to add a few more objects to the test rig for REASSIGN OWNED, so that more cases are exercised. Additional ones could be added for superuser-only-ownable objects (such as FDWs and event triggers) but I didn't want to push my luck by adding a new superuser to the tests on a backpatchable bug fix. Per bug #13666 reported by Chris Pacejo. Backpatch to 9.5. (I would back-patch this all the way back, except that it doesn't apply cleanly in 9.4 and earlier because `59367fdf9` wasn't backpatched. If we decide that we need this in earlier branches too, we should backpatch both.)	2015-12-17 14:25:41 -03:00
Stephen Frost	833728d4c8	Handle policies during DROP OWNED BY DROP OWNED BY handled GRANT-based ACLs but was not removing roles from policies. Fix that by having DROP OWNED BY remove the role specified from the list of roles the policy (or policies) apply to, or the entire policy (or policies) if it only applied to the role specified. As with ACLs, the DROP OWNED BY caller must have permission to modify the policy or a WARNING is thrown and no change is made to the policy.	2015-12-11 16:12:25 -05:00
Alvaro Herrera	1aba62ec63	Allow per-tablespace effective_io_concurrency Per discussion, nowadays it is possible to have tablespaces that have wildly different I/O characteristics from others. Setting different effective_io_concurrency parameters for those has been measured to improve performance. Author: Julien Rouhaud Reviewed by: Andres Freund	2015-09-08 12:51:42 -03:00
Joe Conway	d824e2800f	Disallow converting a table to a view if row security is present. When DefineQueryRewrite() is about to convert a table to a view, it checks the table for features unavailable to views. For example, it rejects tables having triggers. It omits to reject tables having relrowsecurity or a pg_policy record. Fix that. To faciliate the repair, invent relation_has_policies() which indicates the presence of policies on a relation even when row security is disabled for that relation. Reported by Noah Misch. Patch by me, review by Stephen Frost. Back-patch to 9.5 where RLS was introduced.	2015-07-28 16:24:01 -07:00
Robert Haas	a04bb65f70	Add new function pg_notification_queue_usage. This tells you what fraction of NOTIFY's queue is currently filled. Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh. A few further tweaks by me.	2015-07-17 09:12:03 -04:00
Heikki Linnakangas	321eed5f0f	Add ALTER OPERATOR command, for changing selectivity estimator functions. Other options cannot be changed, as it's not totally clear if cached plans would need to be invalidated if one of the other options change. Selectivity estimator functions only change plan costs, not correctness of plans, so those should be safe. Original patch by Uriy Zhuravlev, heavily edited by me.	2015-07-14 18:17:55 +03:00
Alvaro Herrera	7d60b2af34	Fix DDL command collection for TRANSFORM Commit `b488c580ae`, which added the DDL command collection feature, neglected to update the code that commit `cac7658205` had previously added two weeks earlier for the TRANSFORM feature. Reported by Michael Paquier.	2015-06-26 18:17:54 -03:00
Robert Haas	8f15f74a44	Be more conservative about removing tablespace "symlinks". Don't apply rmtree(), which will gleefully remove an entire subtree, and don't even apply unlink() unless it's symlink or a directory, the only things that we expect to find. Amit Kapila, with minor tweaks by me, per extensive discussions involving Andrew Dunstan, Fujii Masao, and Heikki Linnakangas, at least some of whom also reviewed the code.	2015-06-26 15:53:13 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Tom Lane	4db485e75b	Put back a backwards-compatible version of sampling support functions. Commit `83e176ec18` removed the longstanding support functions for block sampling without any consideration of the impact this would have on third-party FDWs. The new API is not notably more functional for FDWs than the old, so forcing them to change doesn't seem like a good thing. We can provide the old API as a wrapper (more or less) around the new one for a minimal amount of extra code.	2015-05-18 18:34:37 -04:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Fujii Masao	ecd222e770	Support VERBOSE option in REINDEX command. When this option is specified, a progress report is printed as each index is reindexed. Per discussion, we agreed on the following syntax for the extensibility of the options. REINDEX (flexible options) { INDEX \| ... } name Sawada Masahiko. Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi, Jim Nasby and me. Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com	2015-05-15 20:09:57 +09:00
Simon Riggs	83e176ec18	Separate block sampling functions Refactoring ahead of tablesample patch Requested and reviewed by Michael Paquier Petr Jelinek	2015-05-15 04:02:54 +02:00
Alvaro Herrera	b488c580ae	Allow on-the-fly capture of DDL event details This feature lets user code inspect and take action on DDL events. Whenever a ddl_command_end event trigger is installed, DDL actions executed are saved to a list which can be inspected during execution of a function attached to ddl_command_end. The set-returning function pg_event_trigger_ddl_commands can be used to list actions so captured; it returns data about the type of command executed, as well as the affected object. This is sufficient for many uses of this feature. For the cases where it is not, we also provide a "command" column of a new pseudo-type pg_ddl_command, which is a pointer to a C structure that can be accessed by C code. The struct contains all the info necessary to completely inspect and even reconstruct the executed command. There is no actual deparse code here; that's expected to come later. What we have is enough infrastructure that the deparsing can be done in an external extension. The intention is that we will add some deparsing code in a later release, as an in-core extension. A new test module is included. It's probably insufficient as is, but it should be sufficient as a starting point for a more complete and future-proof approach. Authors: Álvaro Herrera, with some help from Andres Freund, Ian Barwick, Abhijit Menon-Sen. Reviews by Andres Freund, Robert Haas, Amit Kapila, Michael Paquier, Craig Ringer, David Steele. Additional input from Chris Browne, Dimitri Fontaine, Stephen Frost, Petr Jelínek, Tom Lane, Jim Nasby, Steven Singer, Pavel Stěhule. Based on original work by Dimitri Fontaine, though I didn't use his code. Discussion: https://www.postgresql.org/message-id/m2txrsdzxa.fsf@2ndQuadrant.fr https://www.postgresql.org/message-id/20131108153322.GU5809@eldon.alvh.no-ip.org https://www.postgresql.org/message-id/20150215044814.GL3391@alvh.no-ip.org	2015-05-11 19:14:31 -03:00
Peter Eisentraut	cac7658205	Add transforms feature This provides a mechanism for specifying conversions between SQL data types and procedural languages. As examples, there are transforms for hstore and ltree for PL/Perl and PL/Python. reviews by Pavel Stěhule and Andres Freund	2015-04-26 10:33:14 -04:00
Alvaro Herrera	4ff695b17d	Add log_min_autovacuum_duration per-table option This is useful to control autovacuum log volume, for situations where monitoring only a set of tables is necessary. Author: Michael Paquier Reviewed by: A team led by Naoya Anzai (also including Akira Kurosawa, Taiki Kondo, Huong Dangminh), Fujii Masao.	2015-04-03 11:55:50 -03:00
Alvaro Herrera	8217fb1441	Add OID output argument to DefineTSConfiguration ... which is set to the OID of a copied text search config, whenever the COPY clause is used. This is in the spirit of commit `a2e35b53c3`.	2015-03-25 15:57:08 -03:00
Alvaro Herrera	0d83138974	Rationalize vacuuming options and parameters We were involving the parser too much in setting up initial vacuuming parameters. This patch moves that responsibility elsewhere to simplify code, and also to make future additions easier. To do this, create a new struct VacuumParams which is filled just prior to vacuum execution, instead of at parse time; for user-invoked vacuuming this is set up in a new function ExecVacuum, while autovacuum sets it up by itself. While at it, add a new member VACOPT_SKIPTOAST to enum VacuumOption, only set by autovacuum, which is used to disable vacuuming of the toast table instead of the old do_toast parameter; this relieves the argument list of vacuum() and some callees a bit. This partially makes up for having added more arguments in an effort to avoid having autovacuum from constructing a VacuumStmt parse node. Author: Michael Paquier. Some tweaks by Álvaro Reviewed by: Robert Haas, Stephen Frost, Álvaro Herrera	2015-03-18 11:52:33 -03:00
Alvaro Herrera	31eae6028e	Allow CURRENT/SESSION_USER to be used in certain commands Commands such as ALTER USER, ALTER GROUP, ALTER ROLE, GRANT, and the various ALTER OBJECT / OWNER TO, as well as ad-hoc clauses related to roles such as the AUTHORIZATION clause of CREATE SCHEMA, the FOR clause of CREATE USER MAPPING, and the FOR ROLE clause of ALTER DEFAULT PRIVILEGES can now take the keywords CURRENT_USER and SESSION_USER as user specifiers in place of an explicit user name. This commit also fixes some quite ugly handling of special standards- mandated syntax in CREATE USER MAPPING, which in particular would fail to work in presence of a role named "current_user". The special role specifiers PUBLIC and NONE also have more consistent handling now. Also take the opportunity to add location tracking to user specifiers. Authors: Kyotaro Horiguchi. Heavily reworked by Álvaro Herrera. Reviewed by: Rushabh Lathia, Adam Brightwell, Marti Raudsepp.	2015-03-09 15:41:54 -03:00
Heikki Linnakangas	f1fd515b39	Move WAL-related definitions from dbcommands.h to separate header file. This makes it easier to write frontend programs that needs to understand the WAL record format of CREATE/DROP DATABASE. dbcommands.h cannot easily be #included in a frontend program, because it pulls in other header files that need backend stuff, but the new dbcommands_xlog.h header file has fewer dependencies.	2015-03-09 15:50:49 +02:00
Tom Lane	90c35a9ed0	Code cleanup for REINDEX DATABASE/SCHEMA/SYSTEM. Fix some minor infelicities. Some of these things were introduced in commit `fe263d115a`, and some are older.	2015-03-08 12:18:43 -04:00
Alvaro Herrera	a2e35b53c3	Change many routines to return ObjectAddress rather than OID The changed routines are mostly those that can be directly called by ProcessUtilitySlow; the intention is to make the affected object information more precise, in support for future event trigger changes. Originally it was envisioned that the OID of the affected object would be enough, and in most cases that is correct, but upon actually implementing the event trigger changes it turned out that ObjectAddress is more widely useful. Additionally, some command execution routines grew an output argument that's an object address which provides further info about the executed command. To wit: * for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of the new constraint * for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the schema that originally contained the object. * for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address of the object added to or dropped from the extension. There's no user-visible change in this commit, and no functional change either. Discussion: 20150218213255.GC6717@tamriel.snowman.net Reviewed-By: Stephen Frost, Andres Freund	2015-03-03 14:10:50 -03:00
Tom Lane	8abb3cda0d	Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.	2015-03-01 14:06:55 -05:00
Alvaro Herrera	296f3a6053	Support more commands in event triggers COMMENT, SECURITY LABEL, and GRANT/REVOKE now also fire ddl_command_start and ddl_command_end event triggers, when they operate on database-local objects. Reviewed-By: Michael Paquier, Andres Freund, Stephen Frost	2015-02-23 14:22:42 -03:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Andres Freund	4f85fde8eb	Introduce and use infrastructure for interrupt processing during client reads. Up to now large swathes of backend code ran inside signal handlers while reading commands from the client, to allow for speedy reaction to asynchronous events. Most prominently shared invalidation and NOTIFY handling. That means that complex code like the starting/stopping of transactions is run in signal handlers... The required code was fragile and verbose, and is likely to contain bugs. That approach also severely limited what could be done while communicating with the client. As the read might be from within openssl it wasn't safely possible to trigger an error, e.g. to cancel a backend in idle-in-transaction state. We did that in some cases, namely fatal errors, nonetheless. Now that FE/BE communication in the backend employs non-blocking sockets and latches to block, we can quite simply interrupt reads from signal handlers by setting the latch. That allows us to signal an interrupted read, which is supposed to be retried after returning from within the ssl library. As signal handlers now only need to set the latch to guarantee timely interrupt processing, remove a fair amount of complicated & fragile code from async.c and sinval.c. We could now actually start to process some kinds of interrupts, like sinval ones, more often that before, but that seems better done separately. This work will hopefully allow to handle cases like being blocked by sending data, interrupting idle transactions and similar to be implemented without too much effort. In addition to allowing getting rid of ImmediateInterruptOK, that is. Author: Andres Freund Reviewed-By: Heikki Linnakangas	2015-02-03 22:25:20 +01:00
Tom Lane	8e166e164c	Rearrange explain.c's API so callers need not embed sizeof(ExplainState). The folly of the previous arrangement was just demonstrated: there's no convenient way to add fields to ExplainState without breaking ABI, even if callers have no need to touch those fields. Since we might well need to do that again someday in back branches, let's change things so that only explain.c has to have sizeof(ExplainState) compiled into it. This costs one extra palloc() per EXPLAIN operation, which is surely pretty negligible.	2015-01-15 13:39:33 -05:00
Tom Lane	a5cd70dcbc	Improve performance of EXPLAIN with large range tables. As of 9.3, ruleutils.c goes to some lengths to ensure that table and column aliases used in its output are unique. Of course this takes more time than was required before, which in itself isn't fatal. However, EXPLAIN was set up so that recalculation of the unique aliases was repeated for each subexpression printed in a plan. That results in O(N^2) time and memory consumption for large plan trees, which did not happen in older branches. Fortunately, the expensive work is the same across a whole plan tree, so there is no need to repeat it; we can do most of the initialization just once per query and re-use it for each subexpression. This buys back most (not all) of the performance loss since 9.2. We need an extra ExplainState field to hold the precalculated deparse context. That's no problem in HEAD, but in the back branches, expanding sizeof(ExplainState) seems risky because third-party extensions might have local variables of that struct type. So, in 9.4 and 9.3, introduce an auxiliary struct to keep sizeof(ExplainState) the same. We should refactor the APIs to avoid such local variables in future, but that's material for a separate HEAD-only commit. Per gripe from Alexey Bashtanov. Back-patch to 9.3 where the issue was introduced.	2015-01-15 13:18:12 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Alvaro Herrera	0ee98d1cbf	pg_event_trigger_dropped_objects: add behavior flags Add "normal" and "original" flags as output columns to the pg_event_trigger_dropped_objects() function. With this it's possible to distinguish which objects, among those listed, need to be explicitely referenced when trying to replicate a deletion. This is necessary so that the list of objects can be pruned to the minimum necessary to replicate the DROP command in a remote server that might have slightly different schema (for instance, TOAST tables and constraints with different names and such.) Catalog version bumped due to change of function definition. Reviewed by: Abhijit Menon-Sen, Stephen Frost, Heikki Linnakangas, Robert Haas.	2014-12-19 15:00:45 -03:00
Simon Riggs	fe263d115a	REINDEX SCHEMA Add new SCHEMA option to REINDEX and reindexdb. Sawada Masahiko Reviewed by Michael Paquier and Fabrízio de Royes Mello	2014-12-09 00:28:00 +09:00
Simon Riggs	618c9430a8	Event Trigger for table_rewrite Generate a table_rewrite event when ALTER TABLE attempts to rewrite a table. Provide helper functions to identify table and reason. Intended use case is to help assess or to react to schema changes that might hold exclusive locks for long periods. Dimitri Fontaine, triggering an edit by Simon Riggs Reviewed in detail by Michael Paquier	2014-12-08 00:55:28 +09:00
Heikki Linnakangas	2c03216d83	Revamp the WAL record format. Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.	2014-11-20 18:46:41 +02:00
Alvaro Herrera	85b506bbfc	Get rid of SET LOGGED indexes persistence kludge This removes ATChangeIndexesPersistence() introduced by `f41872d0c1` which was too ugly to live for long. Instead, the correct persistence marking is passed all the way down to reindex_index, so that the transient relation built to contain the index relfilenode can get marked correctly right from the start. Author: Fabrízio de Royes Mello Review and editorialization by Michael Paquier and Álvaro Herrera	2014-11-15 01:19:49 -03:00
Heikki Linnakangas	2076db2aea	Move the backup-block logic from XLogInsert to a new file, xloginsert.c. xlog.c is huge, this makes it a little bit smaller, which is nice. Functions related to putting together the WAL record are in xloginsert.c, and the lower level stuff for managing WAL buffers and such are in xlog.c. Also move the definition of XLogRecord to a separate header file. This causes churn in the #includes of all the files that write WAL records, and redo routines, but it avoids pulling in xlog.h into most places. Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.	2014-11-06 13:55:36 +02:00
Tom Lane	fd0f651a86	Test IsInTransactionChain, not IsTransactionBlock, in vac_update_relstats. As noted by Noah Misch, my initial cut at fixing bug #11638 didn't cover all cases where ANALYZE might be invoked in an unsafe context. We need to test the result of IsInTransactionChain not IsTransactionBlock; which is notationally a pain because IsInTransactionChain requires an isTopLevel flag, which would have to be passed down through several levels of callers. I chose to pass in_outer_xact (ie, the result of IsInTransactionChain) rather than isTopLevel per se, as that seemed marginally more apropos for the intermediate functions to know about.	2014-10-30 13:04:06 -04:00
Tom Lane	90063a7612	Print planning time only in EXPLAIN ANALYZE, not plain EXPLAIN. We've gotten enough push-back on that change to make it clear that it wasn't an especially good idea to do it like that. Revert plain EXPLAIN to its previous behavior, but keep the extra output in EXPLAIN ANALYZE. Per discussion. Internally, I set this up as a separate flag ExplainState.summary that controls printing of planning time and execution time. For now it's just copied from the ANALYZE option, but we could consider exposing it to users.	2014-10-15 18:50:13 -04:00
Stephen Frost	6550b901fe	Code review for row security. Buildfarm member tick identified an issue where the policies in the relcache for a relation were were being replaced underneath a running query, leading to segfaults while processing the policies to be added to a query. Similar to how TupleDesc RuleLocks are handled, add in a equalRSDesc() function to check if the policies have actually changed and, if not, swap back the rsdesc field (using the original instead of the temporairly built one; the whole structure is swapped and then specific fields swapped back). This now passes a CLOBBER_CACHE_ALWAYS for me and should resolve the buildfarm error. In addition to addressing this, add a new chapter in Data Definition under Privileges which explains row security and provides examples of its usage, change \d to always list policies (even if row security is disabled- but note that it is disabled, or enabled with no policies), rework check_role_for_policy (it really didn't need the entire policy, but it did need to be using has_privs_of_role()), and change the field in pg_class to relrowsecurity from relhasrowsecurity, based on Heikki's suggestion. Also from Heikki, only issue SET ROW_SECURITY in pg_restore when talking to a 9.5+ server, list Bypass RLS in \du, and document --enable-row-security options for pg_dump and pg_restore. Lastly, fix a number of minor whitespace and typo issues from Heikki, Dimitri, add a missing #include, per Peter E, fix a few minor variable-assigned-but-not-used and resource leak issues from Coverity and add tab completion for role attribute bypassrls as well.	2014-09-24 16:32:22 -04:00
Stephen Frost	491c029dbc	Row-Level Security Policies (RLS) Building on the updatable security-barrier views work, add the ability to define policies on tables to limit the set of rows which are returned from a query and which are allowed to be added to a table. Expressions defined by the policy for filtering are added to the security barrier quals of the query, while expressions defined to check records being added to a table are added to the with-check options of the query. New top-level commands are CREATE/ALTER/DROP POLICY and are controlled by the table owner. Row Security is able to be enabled and disabled by the owner on a per-table basis using ALTER TABLE .. ENABLE/DISABLE ROW SECURITY. Per discussion, ROW SECURITY is disabled on tables by default and must be enabled for policies on the table to be used. If no policies exist on a table with ROW SECURITY enabled, a default-deny policy is used and no records will be visible. By default, row security is applied at all times except for the table owner and the superuser. A new GUC, row_security, is added which can be set to ON, OFF, or FORCE. When set to FORCE, row security will be applied even for the table owner and superusers. When set to OFF, row security will be disabled when allowed and an error will be thrown if the user does not have rights to bypass row security. Per discussion, pg_dump sets row_security = OFF by default to ensure that exports and backups will have all data in the table or will error if there are insufficient privileges to bypass row security. A new option has been added to pg_dump, --enable-row-security, to ask pg_dump to export with row security enabled. A new role capability, BYPASSRLS, which can only be set by the superuser, is added to allow other users to be able to bypass row security using row_security = OFF. Many thanks to the various individuals who have helped with the design, particularly Robert Haas for his feedback. Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean Rasheed, with additional changes and rework by me. Reviewers have included all of the above, Greg Smith, Jeff McCormick, and Robert Haas.	2014-09-19 11:18:35 -04:00
Andres Freund	728f152e07	Add rmgr callback to name xlog record types for display purposes. This is primarily useful for the upcoming pg_xlogdump --stats feature, but also allows to remove some duplicated code in the rmgr_desc routines. Due to the separation and harmonization, the output of dipsplayed records changes somewhat. But since this isn't enduser oriented content that's ok. It's potentially desirable to further change pg_xlogdump's display of records. It previously wasn't possible to show the record type separately from the description forcing it to be in the last column. But that's better done in a separate commit. Author: Abhijit Menon-Sen, slightly editorialized by me Reviewed-By: Álvaro Herrera, Andres Freund, and Heikki Linnakangas Discussion: 20140604104716.GA3989@toroid.org	2014-09-19 16:20:29 +02:00
Alvaro Herrera	301fcf33eb	Have CREATE TABLE AS and REFRESH return an OID Other DDL commands are already returning the OID, which is required for future additional event trigger work. This is merely making these commands in line with the rest of utility command support.	2014-08-25 15:32:18 -04:00
Alvaro Herrera	f41872d0c1	Implement ALTER TABLE .. SET LOGGED / UNLOGGED This enables changing permanent (logged) tables to unlogged and vice-versa. (Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that hopefully makes more sense than the original.) Author: Fabrízio de Royes Mello Reviewed by: Christoph Berg, Andres Freund, Thom Brown Some tweaking by Álvaro Herrera	2014-08-22 14:27:00 -04:00
Stephen Frost	3c4cf08087	Rework 'MOVE ALL' to 'ALTER .. ALL IN TABLESPACE' As 'ALTER TABLESPACE .. MOVE ALL' really didn't change the tablespace but instead changed objects inside tablespaces, it made sense to rework the syntax and supporting functions to operate under the 'ALTER (TABLE\|INDEX\|MATERIALIZED VIEW)' syntax and to be in tablecmds.c. Pointed out by Alvaro, who also suggested the new syntax. Back-patch to 9.4.	2014-08-21 19:06:17 -04:00
Tom Lane	59efda3e50	Implement IMPORT FOREIGN SCHEMA. This command provides an automated way to create foreign table definitions that match remote tables, thereby reducing tedium and chances for error. In this patch, we provide the necessary core-server infrastructure and implement the feature fully in the postgres_fdw foreign-data wrapper. Other wrappers will throw a "feature not supported" error until/unless they are updated. Ronan Dunklau and Michael Paquier, additional work by me	2014-07-10 15:01:43 -04:00
Heikki Linnakangas	0ef0b6784c	Change the signature of rm_desc so that it's passed a XLogRecord. Just feels more natural, and is more consistent with rm_redo.	2014-06-14 10:46:48 +03:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Noah Misch	7cbe57c34d	Offer triggers on foreign tables. This covers all the SQL-standard trigger types supported for regular tables; it does not cover constraint triggers. The approach for acquiring the old row mirrors that for view INSTEAD OF triggers. For AFTER ROW triggers, we spool the foreign tuples to a tuplestore. This changes the FDW API contract; when deciding which columns to populate in the slot returned from data modification callbacks, writable FDWs will need to check for AFTER ROW triggers in addition to checking for a RETURNING clause. In support of the feature addition, refactor the TriggerFlags bits and the assembly of old tuples in ModifyTable. Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.	2014-03-23 02:16:34 -04:00
Robert Haas	b89e151054	Introduce logical decoding. This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the effected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.	2014-03-03 16:32:18 -05:00
Robert Haas	5f173040e3	Avoid repeated name lookups during table and index DDL. If the name lookups come to different conclusions due to concurrent activity, we might perform some parts of the DDL on a different table than other parts. At least in the case of CREATE INDEX, this can be used to cause the permissions checks to be performed against a different table than the index creation, allowing for a privilege escalation attack. This changes the calling convention for DefineIndex, CreateTrigger, transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible (in 9.2 and newer), and AlterTable (in 9.1 and older). In addition, CheckRelationOwnership is removed in 9.2 and newer and the calling convention is changed in older branches. A field has also been added to the Constraint node (FkConstraint in 8.4). Third-party code calling these functions or using the Constraint node will require updating. Report by Andres Freund. Patch by Robert Haas and Andres Freund, reviewed by Tom Lane. Security: CVE-2014-0062	2014-02-17 09:33:31 -05:00
Alvaro Herrera	801c2dc72c	Separate multixact freezing parameters from xid's Previously we were piggybacking on transaction ID parameters to freeze multixacts; but since there isn't necessarily any relationship between rates of Xid and multixact consumption, this turns out not to be a good idea. Therefore, we now have multixact-specific freezing parameters: vacuum_multixact_freeze_min_age: when to remove multis as we come across them in vacuum (default to 5 million, i.e. early in comparison to Xid's default of 50 million) vacuum_multixact_freeze_table_age: when to force whole-table scans instead of scanning only the pages marked as not all visible in visibility map (default to 150 million, same as for Xids). Whichever of both which reaches the 150 million mark earlier will cause a whole-table scan. autovacuum_multixact_freeze_max_age: when for cause emergency, uninterruptible whole-table scans (default to 400 million, double as that for Xids). This means there shouldn't be more frequent emergency vacuuming than previously, unless multixacts are being used very rapidly. Backpatch to 9.3 where multixacts were made to persist enough to require freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of fields in an unnatural place, and StdRdOptions is split in two so that the newly added fields can go at the end. Patch by me, reviewed by Robert Haas, with additional input from Andres Freund and Tom Lane.	2014-02-13 19:36:31 -03:00
Robert Haas	9347baa5bb	Include planning time in EXPLAIN ANALYZE output. This doesn't work for prepared queries, but it's not too easy to get the information in that case and there's some debate as to exactly what the right thing to measure is, so just do this for now. Andreas Karlsson, with slight doc changes by me.	2014-01-29 16:09:15 -05:00
Stephen Frost	fbe19ee3b8	ALTER TABLESPACE ... MOVE ... OWNED BY Add the ability to specify the objects to move by who those objects are owned by (as relowner) and change ALL to mean ALL objects. This makes the command always operate against a well-defined set of objects and not have the objects-to-be-moved based on the role of the user running the command. Per discussion with Simon and Tom.	2014-01-23 23:52:40 -05:00
Alvaro Herrera	d2458e3b20	Expose a routine to print triggers during EXPLAIN ANALYZE This is so that auto_explain can use it. Kyotaro HORIGUCHI	2014-01-20 17:13:47 -03:00
Stephen Frost	76e91b38ba	Add ALTER TABLESPACE ... MOVE command This adds a 'MOVE' sub-command to ALTER TABLESPACE which allows moving sets of objects from one tablespace to another. This can be extremely handy and avoids a lot of error-prone scripting. ALTER TABLESPACE ... MOVE will only move objects the user owns, will notify the user if no objects were found, and can be used to move ALL objects or specific types of objects (TABLES, INDEXES, or MATERIALIZED VIEWS).	2014-01-18 18:56:40 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Robert Haas	3cff1879f8	Aggressively freeze tables when CLUSTER or VACUUM FULL rewrites them. We haven't wanted to do this in the past on the grounds that in rare cases the original xmin value will be needed for forensic purposes, but commit `37484ad2aa` removes that objection, so now we can. Per extensive discussion, among many people, on pgsql-hackers.	2014-01-02 15:15:51 -05:00
Tom Lane	8d65da1f01	Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane	2013-12-23 16:11:35 -05:00
Alvaro Herrera	f54106f77e	Fix full-table-vacuum request mechanism for MultiXactIds While autovacuum dutifully launched anti-multixact-wraparound vacuums when the multixact "age" was reached, the vacuum code was not aware that it needed to make them be full table vacuums. As the resulting partial-table vacuums aren't capable of actually increasing relminmxid, autovacuum continued to launch anti-wraparound vacuums that didn't have the intended effect, until age of relfrozenxid caused the vacuum to finally be a full table one via vacuum_freeze_table_age. To fix, introduce logic for multixacts similar to that for plain TransactionIds, using the same GUCs. Backpatch to 9.3, where permanent MultiXactIds were introduced. Andres Freund, some cleanup by Álvaro	2013-11-29 21:47:13 -03:00
Tom Lane	6cb86143e8	Allow aggregates to provide estimates of their transition state data size. Formerly the planner had a hard-wired rule of thumb for guessing the amount of space consumed by an aggregate function's transition state data. This estimate is critical to deciding whether it's OK to use hash aggregation, and in many situations the built-in estimate isn't very good. This patch adds a column to pg_aggregate wherein a per-aggregate estimate can be provided, overriding the planner's default, and infrastructure for setting the column via CREATE AGGREGATE. It may be that additional smarts will be required in future, perhaps even a per-aggregate estimation function. But this is already a step forward. This is extracted from a larger patch to improve the performance of numeric and int8 aggregates. I (tgl) thought it was worth reviewing and committing this infrastructure separately. In this commit, all built-in aggregates are given aggtransspace = 0, so no behavior should change. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 16:03:40 -05:00
Robert Haas	d90ced8bb2	Add DISCARD SEQUENCES command. DISCARD ALL will now discard cached sequence information, as well. Fabrízio de Royes Mello, reviewed by Zoltán Böszörményi, with some further tweaks by me.	2013-10-03 16:23:31 -04:00
Alvaro Herrera	dd778e9d88	Rename various "freeze multixact" variables It seems to make more sense to use "cutoff multixact" terminology throughout the backend code; "freeze" is associated with replacing of an Xid with FrozenTransactionId, which is not what we do for MultiXactIds. Andres Freund Some adjustments by Álvaro Herrera	2013-09-16 15:47:31 -03:00
Tom Lane	0d3f4406df	Allow aggregate functions to be VARIADIC. There's no inherent reason why an aggregate function can't be variadic (even VARIADIC ANY) if its transition function can handle the case. Indeed, this patch to add the feature touches none of the planner or executor, and little of the parser; the main missing stuff was DDL and pg_dump support. It is true that variadic aggregates can create the same sort of ambiguity about parameters versus ORDER BY keys that was complained of when we (briefly) had both one- and two-argument forms of string_agg(). However, the policy formed in response to that discussion only said that we'd not create any built-in aggregates with varying numbers of arguments, not that we shouldn't allow users to do it. So the logical extension of that is we can allow users to make variadic aggregates as long as we're wary about shipping any such in core. In passing, this patch allows aggregate function arguments to be named, to the extent of remembering the names in pg_proc and dumping them in pg_dump. You can't yet call an aggregate using named-parameter notation. That seems like a likely future extension, but it'll take some work, and it's not what this patch is really about. Likewise, there's still some work needed to make window functions handle VARIADIC fully, but I left that for another day. initdb forced because of new aggvariadic field in Aggref parse nodes.	2013-09-03 17:08:46 -04:00
Stephen Frost	4cbe3ac3e8	WITH CHECK OPTION support for auto-updatable VIEWs For simple views which are automatically updatable, this patch allows the user to specify what level of checking should be done on records being inserted or updated. For 'LOCAL CHECK', new tuples are validated against the conditionals of the view they are being inserted into, while for 'CASCADED CHECK' the new tuples are validated against the conditionals for all views involved (from the top down). This option is part of the SQL specification. Dean Rasheed, reviewed by Pavel Stehule	2013-07-18 17:10:16 -04:00
Kevin Grittner	cc1965a99b	Add support for REFRESH MATERIALIZED VIEW CONCURRENTLY. This allows reads to continue without any blocking while a REFRESH runs. The new data appears atomically as part of transaction commit. Review questioned the Assert that a matview was not a system relation. This will be addressed separately. Reviewed by Hitoshi Harada, Robert Haas, Andres Freund. Merged after review with security patch `f3ab5d4`.	2013-07-16 12:55:44 -05:00
Noah Misch	813895e4ac	Don't pass oidvector by value. Since the structure ends with a flexible array, doing so truncates any vector having more than one element. New in 9.3, so no back-patch.	2013-06-12 19:50:37 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Tom Lane	1d6c72a55b	Move materialized views' is-populated status into their pg_class entries. Previously this state was represented by whether the view's disk file had zero or nonzero size, which is problematic for numerous reasons, since it's breaking a fundamental assumption about heap storage. This was done to allow unlogged matviews to revert to unpopulated status after a crash despite our lack of any ability to update catalog entries post-crash. However, this poses enough risk of future problems that it seems better to not support unlogged matviews until we can find another way. Accordingly, revert that choice as well as a number of existing kluges forced by it in favor of creating a pg_class.relispopulated flag column.	2013-05-06 13:27:22 -04:00
Tom Lane	0b33790421	Clean up the mess around EXPLAIN and materialized views. Revert the matview-related changes in explain.c's API, as per recent complaint from Robert Haas. The reason for these appears to have been principally some ill-considered choices around having intorel_startup do what ought to be parse-time checking, plus a poor arrangement for passing it the view parsetree it needs to store into pg_rewrite when creating a materialized view. Do the latter by having parse analysis stick a copy into the IntoClause, instead of doing it at runtime. (On the whole, I seriously question the choice to represent CREATE MATERIALIZED VIEW as a variant of SELECT INTO/CREATE TABLE AS, because that means injecting even more complexity into what was already a horrid legacy kluge. However, I didn't go so far as to rethink that choice ... yet.) I also moved several error checks into matview parse analysis, and made the check for external Params in a matview more accurate. In passing, clean things up a bit more around interpretOidsOption(), and fix things so that we can use that to force no-oids for views, sequences, etc, thereby eliminating the need to cons up "oids = false" options when creating them. catversion bump due to change in IntoClause. (I wonder though if we really need readfuncs/outfuncs support for IntoClause anymore.)	2013-04-12 19:25:31 -04:00
Alvaro Herrera	6a76edb188	Fix confusion between ObjectType and ObjectClass Per report by Will Leinweber and Peter Eisentraut	2013-04-11 11:59:47 -03:00
Kevin Grittner	52e6e33ab4	Create a distinction between a populated matview and a scannable one. The intent was that being populated would, long term, be just one of the conditions which could affect whether a matview was scannable; being populated should be necessary but not always sufficient to scan the relation. Since only CREATE and REFRESH currently determine the scannability, names and comments accidentally conflated these concepts, leading to confusion. Also add missing locking for the SQL function which allows a test for scannability, and fix a modularity violatiion. Per complaints from Tom Lane, although its not clear that these will satisfy his concerns. Hopefully this will at least better frame the discussion.	2013-04-09 13:02:49 -05:00
Robert Haas	0bf42a5f3b	Adjust ExplainOneQuery_hook_type to take a DestReceiver argument. The materialized views patch adjusted ExplainOneQuery to take an additional DestReceiver argument, but failed to add a matching argument to the definition of ExplainOneQuery_hook. This is a problem for users of the hook that want to call ExplainOnePlan. Fix by adding the missing argument.	2013-04-09 10:25:08 -04:00
Alvaro Herrera	473ab40c8b	Add sql_drop event for event triggers This event takes place just before ddl_command_end, and is fired if and only if at least one object has been dropped by the command. (For instance, DROP TABLE IF EXISTS of a table that does not in fact exist will not lead to such a trigger firing). Commands that drop multiple objects (such as DROP SCHEMA or DROP OWNED BY) will cause a single event to fire. Some firings might be surprising, such as ALTER TABLE DROP COLUMN. The trigger is fired after the drop has taken place, because that has been deemed the safest design, to avoid exposing possibly-inconsistent internal state (system catalogs as well as current transaction) to the user function code. This means that careful tracking of object identification is required during the object removal phase. Like other currently existing events, there is support for tag filtering. To support the new event, add a new pg_event_trigger_dropped_objects() set-returning function, which returns a set of rows comprising the objects affected by the command. This is to be used within the user function code, and is mostly modelled after the recently introduced pg_identify_object() function. Catalog version bumped due to the new function. Dimitri Fontaine and Álvaro Herrera Review by Robert Haas, Tom Lane	2013-03-28 13:05:48 -03:00
Robert Haas	05f3f9c7b2	Extend object-access hook machinery to support post-alter events. This also slightly widens the scope of what we support in terms of post-create events. KaiGai Kohei, with a few changes, mostly to the comments, by me	2013-03-17 22:57:26 -04:00
Tom Lane	e11cb8ba2c	Fix missing #include in commands/matview.h. It needs parsenodes.h to be compilable regardless of previous headers.	2013-03-06 18:21:05 -05:00
Kevin Grittner	3bf3ab8c56	Add a materialized view relations. A materialized view has a rule just like a view and a heap and other physical properties like a table. The rule is only used to populate the table, references in queries refer to the materialized data. This is a minimal implementation, but should still be useful in many cases. Currently data is only populated "on demand" by the CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements. It is expected that future releases will add incremental updates with various timings, and that a more refined concept of defining what is "fresh" data will be developed. At some point it may even be possible to have queries use a materialized in place of references to underlying tables, but that requires the other above-mentioned features to be working first. Much of the documentation work by Robert Haas. Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja Security review by KaiGai Kohei, with a decision on how best to implement sepgsql still pending.	2013-03-03 18:23:31 -06:00
Heikki Linnakangas	3d009e45bd	Add support for piping COPY to/from an external program. This includes backend "COPY TO/FROM PROGRAM '...'" syntax, and corresponding psql \copy syntax. Like with reading/writing files, the backend version is superuser-only, and in the psql version, the program is run in the client. In the passing, the psql \copy STDIN/STDOUT syntax is subtly changed: if you the stdin/stdout is quoted, it's now interpreted as a filename. For example, "\copy foo from 'stdin'" now reads from a file called 'stdin', not from standard input. Before this, there was no way to specify a filename called stdin, stdout, pstdin or pstdout. This creates a new function in pgport, wait_result_to_str(), which can be used to convert the exit status of a process, as returned by wait(3), to a human-readable string. Etsuro Fujita, reviewed by Amit Kapila.	2013-02-27 18:22:31 +02:00
Alvaro Herrera	0ac5ad5134	Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov	2013-01-23 12:04:59 -03:00
Robert Haas	841a5150c5	Add ddl_command_end support for event triggers. Dimitri Fontaine, with slight changes by me	2013-01-21 18:00:24 -05:00
Alvaro Herrera	765cbfdc92	Refactor ALTER some-obj RENAME implementation Remove duplicate implementations of catalog munging and miscellaneous privilege checks. Instead rely on already existing data in objectaddress.c to do the work. Author: KaiGai Kohei, changes by me Reviewed by: Robert Haas, Álvaro Herrera, Dimitri Fontaine	2013-01-21 12:06:41 -03:00
Alvaro Herrera	7ac5760fa2	Rework order of checks in ALTER / SET SCHEMA When attempting to move an object into the schema in which it already was, for most objects classes we were correctly complaining about exactly that ("object is already in schema"); but for some other object classes, such as functions, we were instead complaining of a name collision ("object already exists in schema"). The latter is wrong and misleading, per complaint from Robert Haas in CA+TgmoZ0+gNf7RDKRc3u5rHXffP=QjqPZKGxb4BsPz65k7qnHQ@mail.gmail.com To fix, refactor the way these checks are done. As a bonus, the resulting code is smaller and can also share some code with Rename cases. While at it, remove use of getObjectDescriptionOids() in error messages. These are normally disallowed because of translatability considerations, but this one had slipped through since 9.1. (Not sure that this is worth backpatching, though, as it would create some untranslated messages in back branches.) This is loosely based on a patch by KaiGai Kohei, heavily reworked by me.	2013-01-15 13:23:43 -03:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Robert Haas	82b1b213ca	Adjust more backend functions to return OID rather than void. This is again intended to support extensions to the event trigger functionality. This may go a bit further than we need for that purpose, but there's some value in being consistent, and the OID may be useful for other purposes also. Dimitri Fontaine	2012-12-29 07:55:37 -05:00
Simon Riggs	42fa810c14	Fix more weird compiler messages caused by unmatched function prototypes. Andres Freund	2012-12-24 16:25:26 +00:00
Robert Haas	c504513f83	Adjust many backend functions to return OID rather than void. Extracted from a larger patch by Dimitri Fontaine. It is hoped that this will provide infrastructure for enriching the new event trigger functionality, but it seems possibly useful for other purposes as well.	2012-12-23 18:37:58 -05:00
Tom Lane	7b90469b71	Allow adding values to an enum type created in the current transaction. Normally it is unsafe to allow ALTER TYPE ADD VALUE in a transaction block, because instances of the value could be added to indexes later in the same transaction, and then they would still be accessible even if the transaction rolls back. However, we can allow this if the enum type itself was created in the current transaction, because then any such indexes would have to go away entirely on rollback. The reason for allowing this is to support pg_upgrade's new usage of pg_restore --single-transaction: in --binary-upgrade mode, pg_dump emits enum types as a succession of ALTER TYPE ADD VALUE commands so that it can preserve the values' OIDs. The support is a bit limited, so we'll leave it undocumented. Andres Freund	2012-12-01 14:27:30 -05:00
Alvaro Herrera	04f28bdb84	Fix ALTER EXTENSION / SET SCHEMA In its original conception, it was leaving some objects into the old schema, but without their proper pg_depend entries; this meant that the old schema could be dropped, causing future pg_dump calls to fail on the affected database. This was originally reported by Jeff Frost as #6704; there have been other complaints elsewhere that can probably be traced to this bug. To fix, be more consistent about altering a table's subsidiary objects along the table itself; this requires some restructuring in how tables are relocated when altering an extension -- hence the new AlterTableNamespaceInternal routine which encapsulates it for both the ALTER TABLE and the ALTER EXTENSION cases. There was another bug lurking here, which was unmasked after fixing the previous one: certain objects would be reached twice via the dependency graph, and the second attempt to move them would cause the entire operation to fail. Per discussion, it seems the best fix for this is to do more careful tracking of objects already moved: we now maintain a list of moved objects, to avoid attempting to do it twice for the same object. Authors: Alvaro Herrera, Dimitri Fontaine Reviewed by Tom Lane	2012-10-31 10:52:55 -03:00
Alvaro Herrera	994c36e01d	refactor ALTER some-obj SET OWNER implementation Remove duplicate implementation of catalog munging and miscellaneous privilege and consistency checks. Instead rely on already existing data in objectaddress.c to do the work. Author: KaiGai Kohei Tweaked by me Reviewed by Robert Haas	2012-10-03 18:07:46 -03:00
Alvaro Herrera	2164f9a125	Refactor "ALTER some-obj SET SCHEMA" implementation Instead of having each object type implement the catalog munging independently, centralize knowledge about how to do it and expand the existing table in objectaddress.c with enough data about each object type to support this operation. Author: KaiGai Kohei Tweaks by me Reviewed by Robert Haas	2012-10-02 18:13:54 -03:00
Tom Lane	11e131854f	Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.	2012-09-21 19:03:10 -04:00
Alvaro Herrera	21c09e99dc	Split heapam_xlog.h from heapam.h The heapam XLog functions are used by other modules, not all of which are interested in the rest of the heapam API. With this, we let them get just the XLog stuff in which they are interested and not pollute them with unrelated includes. Also, since heapam.h no longer requires xlog.h, many files that do include heapam.h no longer get xlog.h automatically, including a few headers. This is useful because heapam.h is getting pulled in by execnodes.h, which is in turn included by a lot of files.	2012-08-28 19:02:00 -04:00
Robert Haas	3a0e4d36eb	Make new event trigger facility actually do something. Commit `3855968f32` added syntax, pg_dump, psql support, and documentation, but the triggers didn't actually fire. With this commit, they now do. This is still a pretty basic facility overall because event triggers do not get a whole lot of information about what the user is trying to do unless you write them in C; and there's still no option to fire them anywhere except at the very beginning of the execution sequence, but it's better than nothing, and a good building block for future work. Along the way, add a regression test for ALTER LARGE OBJECT, since testing of event triggers reveals that we haven't got one. Dimitri Fontaine and Robert Haas	2012-07-20 11:39:01 -04:00
Robert Haas	3855968f32	Syntax support and documentation for event triggers. They don't actually do anything yet; that will get fixed in a follow-on commit. But this gets the basic infrastructure in place, including CREATE/ALTER/DROP EVENT TRIGGER; support for COMMENT, SECURITY LABEL, and ALTER EXTENSION .. ADD/DROP EVENT TRIGGER; pg_dump and psql support; and documentation for the anticipated initial feature set. Dimitri Fontaine, with review and a bunch of additional hacking by me. Thom Brown extensively reviewed earlier versions of this patch set, but there's not a whole lot of that code left in this commit, as it turns out.	2012-07-18 10:16:16 -04:00
Tom Lane	c92be3c059	Avoid pre-determining index names during CREATE TABLE LIKE parsing. Formerly, when trying to copy both indexes and comments, CREATE TABLE LIKE had to pre-assign names to indexes that had comments, because it made up an explicit CommentStmt command to apply the comment and so it had to know the name for the index. This creates bad interactions with other indexes, as shown in bug #6734 from Daniele Varrazzo: the preassignment logic couldn't take any other indexes into account so it could choose a conflicting name. To fix, add a field to IndexStmt that allows it to carry a comment to be assigned to the new index. (This isn't a user-exposed feature of CREATE INDEX, only an internal option.) Now we don't need preassignment of index names in any situation. I also took the opportunity to refactor DefineIndex to accept the IndexStmt as such, rather than passing all its fields individually in a mile-long parameter list. Back-patch to 9.2, but no further, because it seems too dangerous to change IndexStmt or DefineIndex's API in released branches. The bug exists back to 9.0 where CREATE TABLE LIKE grew the ability to copy comments, but given the lack of prior complaints we'll just let it go unfixed before 9.2.	2012-07-16 13:25:18 -04:00
Alvaro Herrera	0c7b9dc7d0	Have REASSIGN OWNED work on extensions, too Per bug #6593, REASSIGN OWNED fails when the affected role has created an extension. Even though the user related to the extension is not nominally the owner, its OID appears on pg_shdepend and thus causes problems when the user is to be dropped. This commit adds code to change the "ownership" of the extension itself, not of the contained objects. This is fine because it's currently only called from REASSIGN OWNED, which would also modify the ownership of the contained objects. However, this is not sufficient for a working ALTER OWNER implementation extension. Back-patch to 9.1, where extensions were introduced. Bug #6593 reported by Emiliano Leporati.	2012-07-03 15:09:59 -04:00
Tom Lane	541ffa65c3	Prevent CREATE TABLE LIKE/INHERITS from (mis) copying whole-row Vars. If a CHECK constraint or index definition contained a whole-row Var (that is, "table.*"), an attempt to copy that definition via CREATE TABLE LIKE or table inheritance produced incorrect results: the copied Var still claimed to have the rowtype of the source table, rather than the created table. For the LIKE case, it seems reasonable to just throw error for this situation, since the point of LIKE is that the new table is not permanently coupled to the old, so there's no reason to assume its rowtype will stay compatible. In the inheritance case, we should ideally allow such constraints, but doing so will require nontrivial refactoring of CREATE TABLE processing (because we'd need to know the OID of the new table's rowtype before we adjust inherited CHECK constraints). In view of the lack of previous complaints, that doesn't seem worth the risk in a back-patched bug fix, so just make it throw error for the inheritance case as well. Along the way, replace change_varattnos_of_a_node() with a more robust function map_variable_attnos(), which is capable of being extended to handle insertion of ConvertRowtypeExpr whenever we get around to fixing the inheritance case nicely, and in the meantime it returns a failure indication to the caller so that a helpful message with some context can be thrown. Also, this code will do the right thing with subselects (if we ever allow them in CHECK or indexes), and it range-checks varattnos before using them to index into the map array. Per report from Sergey Konoplev. Back-patch to all supported branches.	2012-06-30 16:45:14 -04:00
Peter Eisentraut	b8b2e3b2de	Replace int2/int4 in C code with int16/int32 The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.	2012-06-25 01:51:46 +03:00
Tom Lane	cfa0f4255b	Improve tests for whether we can skip queueing RI enforcement triggers. During an update of a PK row, we can skip firing the RI trigger if any old key value is NULL, because then the row could not have had any matching rows in the FK table. Conversely, during an update of an FK row, the outcome is determined if any new key value is NULL. In either case it becomes unnecessary to compare individual key values. This patch was inspired by discussion of Vik Reykja's patch to use IS NOT DISTINCT semantics for the key comparisons. In the event there is no need for that and so this patch looks nothing like his, but he should still get credit for having re-opened consideration of the trigger skip logic.	2012-06-19 20:07:33 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Tom Lane	263d9de66b	Allow statistics to be collected for foreign tables. ANALYZE now accepts foreign tables and allows the table's FDW to control how the sample rows are collected. (But only manual ANALYZEs will touch foreign tables, for the moment, since among other things it's not very clear how to handle remote permissions checks in an auto-analyze.) contrib/file_fdw is extended to support this. Etsuro Fujita, reviewed by Shigeru Hanada, some further tweaking by me.	2012-04-06 15:02:35 -04:00
Peter Eisentraut	38b9693fd9	Add support for renaming domain constraints	2012-04-03 08:11:51 +03:00
Tom Lane	9dbf2b7d75	Restructure SELECT INTO's parsetree representation into CreateTableAsStmt. Making this operation look like a utility statement seems generally a good idea, and particularly so in light of the desire to provide command triggers for utility statements. The original choice of representing it as SELECT with an IntoClause appendage had metastasized into rather a lot of places, unfortunately, so that this patch is a great deal more complicated than one might at first expect. In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS subcommands required restructuring some EXPLAIN-related APIs. Add-on code that calls ExplainOnePlan or ExplainOneUtility, or uses ExplainOneQuery_hook, will need adjustment. Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO, which formerly were accepted though undocumented, are no longer accepted. The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE. The CREATE RULE case doesn't seem to have much real-world use (since the rule would work only once before failing with "table already exists"), so we'll not bother with that one. Both SELECT INTO and CREATE TABLE AS still return a command tag of "SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn", but for the moment backwards compatibility wins the day. Andres Freund and Tom Lane	2012-03-19 21:38:12 -04:00
Peter Eisentraut	39d74e346c	Add support for renaming constraints reviewed by Josh Berkus and Dimitri Fontaine	2012-03-10 20:19:13 +02:00
Tom Lane	0e5e167aae	Collect and use element-frequency statistics for arrays. This patch improves selectivity estimation for the array <@, &&, and @> (containment and overlaps) operators. It enables collection of statistics about individual array element values by ANALYZE, and introduces operator-specific estimators that use these stats. In addition, ScalarArrayOpExpr constructs of the forms "const = ANY/ALL (array_column)" and "const <> ANY/ALL (array_column)" are estimated by treating them as variants of the containment operators. Since we still collect scalar-style stats about the array values as a whole, the pg_stats view is expanded to show both these stats and the array-style stats in separate columns. This creates an incompatible change in how stats for tsvector columns are displayed in pg_stats: the stats about lexemes are now displayed in the array-related columns instead of the original scalar-related columns. There are a few loose ends here, notably that it'd be nice to be able to suppress either the scalar-style stats or the array-element stats for columns for which they're not useful. But the patch is in good enough shape to commit for wider testing. Alexander Korotkov, reviewed by Noah Misch and Nathan Boley	2012-03-03 20:20:57 -05:00
Peter Eisentraut	66f0cf7da8	Remove useless const qualifier Claiming that the typevar argument to DefineCompositeType() is const was a plain lie. A similar case in DefineVirtualRelation() was already changed in passing in commit `1575fbcb`. Also clean up the now unnecessary casts that used to cast away the const.	2012-02-26 15:22:27 +02:00
Alvaro Herrera	a417f85e1d	REASSIGN OWNED: Support foreign data wrappers and servers This was overlooked when implementing those kinds of objects, in commit `cae565e503`. Per report from Pawel Casperek.	2012-02-22 17:33:12 -03:00
Robert Haas	af7914c662	Add TIMING option to EXPLAIN, to allow eliminating of timing overhead. Sometimes it may be useful to get actual row counts out of EXPLAIN (ANALYZE) without paying the cost of timing every node entry/exit. With this patch, you can say EXPLAIN (ANALYZE, TIMING OFF) to get that. Tomas Vondra, reviewed by Eric Theise, with minor doc changes by me.	2012-02-07 11:23:04 -05:00
Peter Eisentraut	2787458362	Disallow ALTER DOMAIN on non-domain type everywhere This has been the behavior already in most cases, but through omission, ALTER DOMAIN / OWNER TO and ALTER DOMAIN / SET SCHEMA would silently work on non-domain types as well.	2012-01-27 21:20:34 +02:00
Alvaro Herrera	74ab96a45e	Add pg_trigger_depth() function This reports the depth level of triggers currently in execution, or zero if not called from inside a trigger. No catversion bump in this patch, but you have to initdb if you want access to the new function. Author: Kevin Grittner	2012-01-25 13:22:54 -03:00
Robert Haas	1489e2f26a	Improve behavior of concurrent ALTER TABLE, and do some refactoring. ALTER TABLE (and ALTER VIEW, ALTER SEQUENCE, etc.) now use a RangeVarGetRelid callback to check permissions before acquiring a table lock. We also now use the same callback for all forms of ALTER TABLE, rather than having separate, almost-identical callbacks for ALTER TABLE .. SET SCHEMA and ALTER TABLE .. RENAME, and no callback at all for everything else. I went ahead and changed the code so that no form of ALTER TABLE works on foreign tables; you must use ALTER FOREIGN TABLE instead. In 9.1, it was possible to use ALTER TABLE .. SET SCHEMA or ALTER TABLE .. RENAME on a foreign table, but not any other form of ALTER TABLE, which did not seem terribly useful or consistent. Patch by me; review by Noah Misch.	2012-01-06 22:42:26 -05:00
Peter Eisentraut	104e7dac28	Improve ALTER DOMAIN / DROP CONSTRAINT with nonexistent constraint ALTER DOMAIN / DROP CONSTRAINT on a nonexistent constraint name did not report any error. Now it reports an error. The IF EXISTS option was added to get the usual behavior of ignoring nonexistent objects to drop.	2012-01-05 19:48:55 +02:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Peter Eisentraut	f90dd28062	Add ALTER DOMAIN ... RENAME You could already rename domains using ALTER TYPE, but with this new command it is more consistent with how other commands treat domains as a subcategory of types.	2011-12-22 22:43:56 +02:00
Robert Haas	cbe24a6dd8	Improve behavior of concurrent CLUSTER. In the previous coding, a user could queue up for an AccessExclusiveLock on a table they did not have permission to cluster, thus potentially interfering with access by authorized users who got stuck waiting behind the AccessExclusiveLock. This approach avoids that. cluster() has the same permissions-checking requirements as REINDEX TABLE, so this commit moves the now-shared callback to tablecmds.c and renames it, per discussion with Noah Misch.	2011-12-21 15:17:28 -05:00
Robert Haas	74a1d4fe7c	Improve behavior of concurrent rename statements. Previously, renaming a table, sequence, view, index, foreign table, column, or trigger checked permissions before locking the object, which meant that if permissions were revoked during the lock wait, we would still allow the operation. Similarly, if the original object is dropped and a new one with the same name is created, the operation will be allowed if we had permissions on the old object; the permissions on the new object don't matter. All this is now fixed. Along the way, attempting to rename a trigger on a foreign table now gives the same error message as trying to create one there in the first place (i.e. that it's not a table or view) rather than simply stating that no trigger by that name exists. Patch by me; review by Noah Misch.	2011-12-15 19:02:38 -05:00
Peter Eisentraut	5bcf8ede45	Add ALTER FOREIGN DATA WRAPPER / RENAME and ALTER SERVER / RENAME	2011-12-09 20:42:30 +02:00
Robert Haas	fc6d1006bd	Further consolidation of DROP statement handling. This gets rid of an impressive amount of duplicative code, with only minimal behavior changes. DROP FOREIGN DATA WRAPPER now requires object ownership rather than superuser privileges, matching the documentation we already have. We also eliminate the historical warning about dropping a built-in function as unuseful. All operations are now performed in the same order for all object types handled by dropcmds.c. KaiGai Kohei, with minor revisions by me	2011-11-17 21:32:34 -05:00
Robert Haas	67dc4eed42	Remove ancient downcasing code from procedural language operations. A very long time ago, language names were specified as literals rather than identifiers, so this code was added to do case-folding. But that style has ben deprecated for many years so this isn't needed any more. Language names will still be downcased when specified as unquoted identifiers, but quoted identifiers or the old style using string literals will be left as-is.	2011-11-17 14:25:18 -05:00
Heikki Linnakangas	4429f6a9e3	Support range data types. Selectivity estimation functions are missing for some range type operators, which is a TODO. Jeff Davis	2011-11-03 13:42:15 +02:00
Robert Haas	82a4a777d9	Consolidate DROP handling for some object types. This gets rid of a significant amount of duplicative code. KaiGai Kohei, reviewed in earlier versions by Dimitri Fontaine, with further review and cleanup by me.	2011-10-19 23:27:19 -04:00
Tom Lane	e6858e6657	Measure the number of all-visible pages for use in index-only scan costing. Add a column pg_class.relallvisible to remember the number of pages that were all-visible according to the visibility map as of the last VACUUM (or ANALYZE, or some other operations that update pg_class.relpages). Use relallvisible/relpages, instead of an arbitrary constant, to estimate how many heap page fetches can be avoided during an index-only scan. This is pretty primitive and will no doubt see refinements once we've acquired more field experience with the index-only scan mechanism, but it's way better than using a constant. Note: I had to adjust an underspecified query in the window.sql regression test, because it was changing answers when the plan changed to use an index-only scan. Some of the adjacent tests perhaps should be adjusted as well, but I didn't do that here.	2011-10-14 17:23:46 -04:00
Tom Lane	e6faf910d7	Redesign the plancache mechanism for more flexibility and efficiency. Rewrite plancache.c so that a "cached plan" (which is rather a misnomer at this point) can support generation of custom, parameter-value-dependent plans, and can make an intelligent choice between using custom plans and the traditional generic-plan approach. The specific choice algorithm implemented here can probably be improved in future, but this commit is all about getting the mechanism in place, not the policy. In addition, restructure the API to greatly reduce the amount of extraneous data copying needed. The main compromise needed to make that possible was to split the initial creation of a CachedPlanSource into two steps. It's worth noting in particular that SPI_saveplan is now deprecated in favor of SPI_keepplan, which accomplishes the same end result with zero data copying, and no need to then spend even more cycles throwing away the original SPIPlan. The risk of long-term memory leaks while manipulating SPIPlans has also been greatly reduced. Most of this improvement is based on use of the recently-added MemoryContextSetParent primitive.	2011-09-16 00:43:52 -04:00
Tom Lane	a7801b62f2	Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h. As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.	2011-09-09 13:23:41 -04:00
Tom Lane	1609797c25	Clean up the #include mess a little. walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.	2011-09-04 01:13:16 -04:00
Tom Lane	5b562644fe	Teach ANALYZE to clear pg_class.relhassubclass when appropriate. In the past, relhassubclass always remained true if a relation had ever had child relations, even if the last subclass was long gone. While this had only marginal performance implications in most cases, it was annoying, and I'm now considering some planner changes that would raise the cost of a false positive. It was previously impractical to fix this because of race condition concerns. However, given the recent change that made tablecmds.c take ShareExclusiveLock on relations that are gaining a child (commit `fbcf4b92aa`), we can now allow ANALYZE to clear the flag when it's no longer relevant. There is no additional locking cost to do so, since ANALYZE takes ShareExclusiveLock anyway.	2011-09-02 14:29:31 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Bruce Momjian	2ab15afcdd	Add missing include so include file compiles cleanly on its own.	2011-08-22 23:19:21 -04:00
Robert Haas	463f2625a5	Support SECURITY LABEL on databases, tablespaces, and roles. This requires a new shared catalog, pg_shseclabel. Along the way, fix the security_label regression tests so that they don't monkey with the labels of any pre-existing objects. This is unlikely to matter in practice, since only the label for the "dummy" provider was being manipulated. But this way still seems cleaner. KaiGai Kohei, with fairly extensive hacking by me.	2011-07-20 13:18:24 -04:00
Robert Haas	367bc426a1	Avoid index rebuild for no-rewrite ALTER TABLE .. ALTER TYPE. Noah Misch. Review and minor cosmetic changes by me.	2011-07-18 11:04:43 -04:00
Robert Haas	4240e429d0	Try to acquire relation locks in RangeVarGetRelid. In the previous coding, we would look up a relation in RangeVarGetRelid, lock the resulting OID, and then AcceptInvalidationMessages(). While this was sufficient to ensure that we noticed any changes to the relation definition before building the relcache entry, it didn't handle the possibility that the name we looked up no longer referenced the same OID. This was particularly problematic in the case where a table had been dropped and recreated: we'd latch on to the entry for the old relation and fail later on. Now, we acquire the relation lock inside RangeVarGetRelid, and retry the name lookup if we notice that invalidation messages have been processed meanwhile. Many operations that would previously have failed with an error in the presence of concurrent DDL will now succeed. There is a good deal of work remaining to be done here: many callers of RangeVarGetRelid still pass NoLock for one reason or another. In addition, nothing in this patch guards against the possibility that the meaning of an unqualified name might change due to the creation of a relation in a schema earlier in the user's search path than the one where it was previously found. Furthermore, there's nothing at all here to guard against similar race conditions for non-relations. For all that, it's a start. Noah Misch and Robert Haas	2011-07-08 22:19:30 -04:00
Alvaro Herrera	897795240c	Enable CHECK constraints to be declared NOT VALID This means that they can initially be added to a large existing table without checking its initial contents, but new tuples must comply to them; a separate pass invoked by ALTER TABLE / VALIDATE can verify existing data and ensure it complies with the constraint, at which point it is marked validated and becomes a normal part of the table ecosystem. An non-validated CHECK constraint is ignored in the planner for constraint_exclusion purposes; when validated, cached plans are recomputed so that partitioning starts working right away. This patch also enables domains to have unvalidated CHECK constraints attached to them as well by way of ALTER DOMAIN / ADD CONSTRAINT / NOT VALID, which can later be validated with ALTER DOMAIN / VALIDATE CONSTRAINT. Thanks to Thom Brown, Dean Rasheed and Jaime Casanova for the various reviews, and Robert Hass for documentation wording improvement suggestions. This patch was sponsored by Enova Financial.	2011-06-30 11:24:31 -04:00
Tom Lane	b4b6923e03	Fix VACUUM so that it always updates pg_class.reltuples/relpages. When we added the ability for vacuum to skip heap pages by consulting the visibility map, we made it just not update the reltuples/relpages statistics if it skipped any pages. But this could leave us with extremely out-of-date stats for a table that contains any unchanging areas, especially for TOAST tables which never get processed by ANALYZE. In particular this could result in autovacuum making poor decisions about when to process the table, as in recent report from Florian Helmberger. And in general it's a bad idea to not update the stats at all. Instead, use the previous values of reltuples/relpages as an estimate of the tuple density in unvisited pages. This approach results in a "moving average" estimate of reltuples, which should converge to the correct value over multiple VACUUM and ANALYZE cycles even when individual measurements aren't very good. This new method for updating reltuples is used by both VACUUM and ANALYZE, with the result that we no longer need the grotty interconnections that caused ANALYZE to not update the stats depending on what had happened in the parent VACUUM command. Also, fix the logic for skipping all-visible pages during VACUUM so that it looks ahead rather than behind to decide what to do, as per a suggestion from Greg Stark. This eliminates useless scanning of all-visible pages at the start of the relation or just after a not-all-visible page. In particular, the first few pages of the relation will not be invariably included in the scanned pages, which seems to help in not overweighting them in the reltuples estimate. Back-patch to 8.4, where the visibility map was introduced.	2011-05-30 17:06:52 -04:00
Robert Haas	68739ba856	Allow ALTER TABLE name {OF type \| NOT OF}. This syntax allows a standalone table to be made into a typed table, or a typed table to be made standalone. This is possibly a mildly useful feature in its own right, but the real motivation for this change is that we need it to make pg_upgrade work with typed tables. This doesn't actually fix that problem, but it's necessary infrastructure. Noah Misch	2011-04-20 21:38:47 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	2594cf0e8c	Revise the API for GUC variable assign hooks. The previous functions of assign hooks are now split between check hooks and assign hooks, where the former can fail but the latter shouldn't. Aside from being conceptually clearer, this approach exposes the "canonicalized" form of the variable value to guc.c without having to do an actual assignment. And that lets us fix the problem recently noted by Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus log messages about "parameter "wal_buffers" cannot be changed without restarting the server". There may be some speed advantage too, because this design lets hook functions avoid re-parsing variable values when restoring a previous state after a rollback (they can store a pre-parsed representation of the value instead). This patch also resolves a longstanding annoyance about custom error messages from variable assign hooks: they should modify, not appear separately from, guc.c's own message about "invalid parameter value".	2011-04-07 00:12:02 -04:00
Tom Lane	696d1f7f06	Make all comparisons done for/with statistics use the default collation. While this will give wrong answers when estimating selectivity for a comparison operator that's using a non-default collation, the estimation error probably won't be large; and anyway the former approach created estimation errors of its own by trying to use a histogram that might have been computed with some other collation. So we'll adopt this simplified approach for now and perhaps improve it sometime in the future. This patch incorporates changes from Andres Freund to make sure that selfuncs.c passes a valid collation OID to any datatype-specific function it calls, in case that function wants collation information. Said OID will now always be DEFAULT_COLLATION_OID, but at least we won't get errors.	2011-03-12 16:30:36 -05:00
Tom Lane	a210be7720	Fix dangling-pointer problem in before-row update trigger processing. ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new tuple value by seeing if the returned tuple was pointer-equal to the old one. But the "old one" was in estate->es_junkFilter's result slot, which would be scribbled on if we had done an EvalPlanQual update in response to a concurrent update of the target tuple; therefore we were comparing a dangling pointer to a live one. Given the right set of circumstances we could get a false match, resulting in not forcing the tuple to be stored in the slot we thought it was stored in. In the case reported by Maxim Boguk in bug #5798, this led to "cannot extract system attribute from virtual tuple" failures when trying to do "RETURNING ctid". I believe there is a very-low-probability chance of more serious errors, such as generating incorrect index entries based on the original rather than the trigger-modified version of the row. In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers, ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to have similar APIs. In the back branches I just changed ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers case.	2011-02-21 21:19:50 -05:00
Tom Lane	7c5d0ae707	Add contrib/file_fdw foreign-data wrapper for reading files via COPY. This is both very useful in its own right, and an important test case for the core FDW support. This commit includes a small refactoring of copy.c to expose its option checking code as a separately callable function. The original patch submission duplicated hundreds of lines of that code, which seemed pretty unmaintainable. Shigeru Hanada, reviewed by Itagaki Takahiro and Tom Lane	2011-02-20 14:06:59 -05:00
Tom Lane	bb74240794	Implement an API to let foreign-data wrappers actually be functional. This commit provides the core code and documentation needed. A contrib module test case will follow shortly. Shigeru Hanada, Jan Urbanski, Heikki Linnakangas	2011-02-20 00:18:14 -05:00
Itagaki Takahiro	8ddc05fb01	Export the external file reader used in COPY FROM as APIs. They are expected to be used by extension modules like file_fdw. There are no user-visible changes. Itagaki Takahiro Reviewed and tested by Kevin Grittner and Noah Misch.	2011-02-16 11:19:11 +09:00
Peter Eisentraut	b313bca0af	DDL support for collations - collowner field - CREATE COLLATION - ALTER COLLATION - DROP COLLATION - COMMENT ON COLLATION - integration with extensions - pg_dump support for the above - dependency management - psql tab completion - psql \dO command	2011-02-12 15:55:18 +02:00
Tom Lane	1214749901	Add support for multiple versions of an extension and ALTER EXTENSION UPDATE. This follows recent discussions, so it's quite a bit different from Dimitri's original. There will probably be more changes once we get a bit of experience with it, but let's get it in and start playing with it. This is still just core code. I'll start converting contrib modules shortly. Dimitri Fontaine and Tom Lane	2011-02-11 21:25:57 -05:00
Robert Haas	2c20ba1fd2	Tweak find_composite_type_dependencies API a bit more. Per discussion with Noah Misch, the previous coding, introduced by my commit `65377e0b9c` on 2011-02-06, was really an abuse of RELKIND_COMPOSITE_TYPE, since the caller in typecmds.c is actually passing the name of a domain. So go back having a type name argument, but make the first argument a Relation rather than just a string so we can tell whether it's a table or a foreign table and emit the proper error message.	2011-02-11 08:47:38 -05:00
Tom Lane	01467d3e4f	Extend "ALTER EXTENSION ADD object" to permit "DROP object" as well. Per discussion, this is something we should have sooner rather than later, and it doesn't take much additional code to support it.	2011-02-10 17:37:22 -05:00
Tom Lane	caddcb8f4b	Fix pg_upgrade to handle extensions. This follows my proposal of yesterday, namely that we try to recreate the previous state of the extension exactly, instead of allowing CREATE EXTENSION to run a SQL script that might create some entirely-incompatible on-disk state. In --binary-upgrade mode, pg_dump won't issue CREATE EXTENSION at all, but instead uses a kluge function provided by pg_upgrade_support to recreate the pg_extension row (and extension-level pg_depend entries) without creating any member objects. The member objects are then restored in the same way as if they weren't members, in particular using pg_upgrade's normal hacks to preserve OIDs that need to be preserved. Then, for each member object, ALTER EXTENSION ADD is issued to recreate the pg_depend entry that marks it as an extension member. In passing, fix breakage in pg_upgrade's enum-type support: somebody didn't fix it when the noise word VALUE got added to ALTER TYPE ADD. Also, rationalize parsetree representation of COMMENT ON DOMAIN and fix get_object_address() to allow OBJECT_DOMAIN.	2011-02-09 19:18:08 -05:00
Tom Lane	5bc178b89f	Implement "ALTER EXTENSION ADD object". This is an essential component of making the extension feature usable; first because it's needed in the process of converting an existing installation containing "loose" objects of an old contrib module into the extension-based world, and second because we'll have to use it in pg_dump --binary-upgrade, as per recent discussion. Loosely based on part of Dimitri Fontaine's ALTER EXTENSION UPGRADE patch.	2011-02-09 11:56:37 -05:00
Tom Lane	d9572c4e3b	Core support for "extensions", which are packages of SQL objects. This patch adds the server infrastructure to support extensions. There is still one significant loose end, namely how to make it play nice with pg_upgrade, so I am not yet committing the changes that would make all the contrib modules depend on this feature. In passing, fix a disturbingly large amount of breakage in AlterObjectNamespace() and callers. Dimitri Fontaine, reviewed by Anssi Kääriäinen, Itagaki Takahiro, Tom Lane, and numerous others	2011-02-08 16:13:22 -05:00
Peter Eisentraut	414c5a2ea6	Per-column collation support This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch	2011-02-08 23:04:18 +02:00
Heikki Linnakangas	dafaa3efb7	Implement genuine serializable isolation level. Until now, our Serializable mode has in fact been what's called Snapshot Isolation, which allows some anomalies that could not occur in any serialized ordering of the transactions. This patch fixes that using a method called Serializable Snapshot Isolation, based on research papers by Michael J. Cahill (see README-SSI for full references). In Serializable Snapshot Isolation, transactions run like they do in Snapshot Isolation, but a predicate lock manager observes the reads and writes performed and aborts transactions if it detects that an anomaly might occur. This method produces some false positives, ie. it sometimes aborts transactions even though there is no anomaly. To track reads we implement predicate locking, see storage/lmgr/predicate.c. Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared memory is finite, so when a transaction takes many tuple-level locks on a page, the locks are promoted to a single page-level lock, and further to a single relation level lock if necessary. To lock key values with no matching tuple, a sequential scan always takes a relation-level lock, and an index scan acquires a page-level lock that covers the search key, whether or not there are any matching keys at the moment. A predicate lock doesn't conflict with any regular locks or with another predicate locks in the normal sense. They're only used by the predicate lock manager to detect the danger of anomalies. Only serializable transactions participate in predicate locking, so there should be no extra overhead for for other transactions. Predicate locks can't be released at commit, but must be remembered until all the transactions that overlapped with it have completed. That means that we need to remember an unbounded amount of predicate locks, so we apply a lossy but conservative method of tracking locks for committed transactions. If we run short of shared memory, we overflow to a new "pg_serial" SLRU pool. We don't currently allow Serializable transactions in Hot Standby mode. That would be hard, because even read-only transactions can cause anomalies that wouldn't otherwise occur. Serializable isolation mode now means the new fully serializable level. Repeatable Read gives you the old Snapshot Isolation level that we have always had. Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and Anssi Kääriäinen	2011-02-08 00:09:08 +02:00
Robert Haas	65377e0b9c	Tighten ALTER FOREIGN TABLE .. SET DATA TYPE checks. If the foreign table's rowtype is being used as the type of a column in another table, we can't just up and change its data type. This was already checked for composite types and ordinary tables, but we previously failed to enforce it for foreign tables.	2011-02-06 00:26:27 -05:00
Robert Haas	6f59777c65	Code cleanup for assign_transaction_read_only. As in commit `fb4c5d2798` on 2011-01-21, this avoids spurious debug messages and allows idempotent changes at any time. Along the way, make assign_XactIsoLevel allow idempotent changes even when not within a subtransaction, to be consistent with the new coding of assign_transaction_read_only and because there's no compelling reason to do otherwise. Kevin Grittner, with some adjustments.	2011-01-22 20:55:50 -05:00
Robert Haas	8ceb245680	Make ALTER TABLE revalidate uniqueness and exclusion constraints. Failure to do so can lead to constraint violations. This was broken by commit `1ddc2703a9` on 2010-02-07, so back-patch to 9.0. Noah Misch. Regression test by me.	2011-01-20 22:44:10 -05:00
Tom Lane	6ca452ba7f	Move a couple of declarations to reflect where the routines really are.	2011-01-15 16:09:05 -05:00
Robert Haas	7f60be72b0	Fix crash in ALTER OPERATOR CLASS/FAMILY .. SET SCHEMA. In the previous coding, the parser emitted a List containing a C string, which is no good, because copyObject() can't handle it. Dimitri Fontaine	2011-01-03 22:08:55 -05:00
Peter Eisentraut	39b8843296	Implement remaining fields of information_schema.sequences view Add new function pg_sequence_parameters that returns a sequence's start, minimum, maximum, increment, and cycle values, and use that in the view. (bug #5662; design suggestion by Tom Lane) Also slightly adjust the view's column order and permissions after review of SQL standard.	2011-01-02 15:15:21 +02:00
Robert Haas	0d692a0dc9	Basic foreign table support. Foreign tables are a core component of SQL/MED. This commit does not provide a working SQL/MED infrastructure, because foreign tables cannot yet be queried. Support for foreign table scans will need to be added in a future patch. However, this patch creates the necessary system catalog structure, syntax support, and support for ancillary operations such as COMMENT and SECURITY LABEL. Shigeru Hanada, heavily revised by Robert Haas	2011-01-01 23:48:11 -05:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Robert Haas	5f7b58fad8	Generalize concept of temporary relations to "relation persistence". This commit replaces pg_class.relistemp with pg_class.relpersistence; and also modifies the RangeVar node type to carry relpersistence rather than istemp. It also removes removes rd_istemp from RelationData and instead performs the correct computation based on relpersistence. For clarity, we add three new macros: RelationNeedsWAL(), RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we can clarify the purpose of each check that previous depended on rd_istemp. This is intended as infrastructure for the upcoming unlogged tables patch, as well as for future possible work on global temporary tables.	2010-12-13 12:34:26 -05:00
Robert Haas	55109313f9	Add more ALTER <object> .. SET SCHEMA commands. This adds support for changing the schema of a conversion, operator, operator class, operator family, text search configuration, text search dictionary, text search parser, or text search template. Dimitri Fontaine, with assorted corrections and other kibitzing.	2010-11-26 17:31:54 -05:00
Peter Eisentraut	f2a4278330	Propagate ALTER TYPE operations to typed tables This adds RESTRICT/CASCADE flags to ALTER TYPE ... ADD/DROP/ALTER/ RENAME ATTRIBUTE to control whether to alter typed tables as well.	2010-11-23 22:50:17 +02:00
Tom Lane	511e902b51	Make TRUNCATE ... RESTART IDENTITY restart sequences transactionally. In the previous coding, we simply issued ALTER SEQUENCE RESTART commands, which do not roll back on error. This meant that an error between truncating and committing left the sequences out of sync with the table contents, with potentially bad consequences as were noted in a Warning on the TRUNCATE man page. To fix, create a new storage file (relfilenode) for a sequence that is to be reset due to RESTART IDENTITY. If the transaction aborts, we'll automatically revert to the old storage file. This acts just like a rewriting ALTER TABLE operation. A penalty is that we have to take exclusive lock on the sequence, but since we've already got exclusive lock on its owning table, that seems unlikely to be much of a problem. The interaction of this with usual nontransactional behaviors of sequence operations is a bit weird, but it's hard to see what would be completely consistent. Our choice is to discard cached-but-unissued sequence values both when the RESTART is executed, and at rollback if any; but to not touch the currval() state either time. In passing, move the sequence reset operations to happen before not after any AFTER TRUNCATE triggers are fired. The previous ordering was not logically sensible, but was forced by the need to minimize inconsistency if the triggers caused an error. Transactional rollback is a much better solution to that. Patch by Steve Singer, rather heavily adjusted by me.	2010-11-17 16:42:18 -05:00
Tom Lane	84c123be1d	Allow new values to be added to an existing enum type. After much expenditure of effort, we've got this to the point where the performance penalty is pretty minimal in typical cases. Andrew Dunstan, reviewed by Brendan Jurd, Dean Rasheed, and Tom Lane	2010-10-24 23:05:41 -04:00
Tom Lane	2ec993a7cb	Support triggers on views. This patch adds the SQL-standard concept of an INSTEAD OF trigger, which is fired instead of performing a physical insert/update/delete. The trigger function is passed the entire old and/or new rows of the view, and must figure out what to do to the underlying tables to implement the update. So this feature can be used to implement updatable views using trigger programming style rather than rule hacking. In passing, this patch corrects the names of some columns in the information_schema.triggers view. It seems the SQL committee renamed them somewhere between SQL:99 and SQL:2003. Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.	2010-10-10 13:45:07 -04:00
Robert Haas	4d355a8336	Add a SECURITY LABEL command. This is intended as infrastructure to support integration with label-based mandatory access control systems such as SE-Linux. Further changes (mostly hooks) will be needed, but this is a big chunk of it. KaiGai Kohei and Robert Haas	2010-09-27 20:55:27 -04:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	b5565bca11	Fix failure of "ALTER TABLE t ADD COLUMN c serial" when done by non-owner. The implicitly created sequence was created as owned by the current user, who could be different from the table owner, eg if current user is a superuser or some member of the table's owning role. This caused sanity checks in the SEQUENCE OWNED BY code to spit up. Although possibly we don't need those sanity checks, the safest fix seems to be to make sure the implicit sequence is assigned the same owner role as the table has. (We still do all permissions checks as the current user, however.) Per report from Josh Berkus. Back-patch to 9.0. The bug goes back to the invention of SEQUENCE OWNED BY in 8.2, but the fix requires an API change for DefineRelation(), which seems to have potential for breaking third-party code if done in a minor release. Given the lack of prior complaints, it's probably not worth fixing in the stable branches.	2010-08-18 18:35:21 +00:00
Robert Haas	fd1843ff89	Standardize get_whatever_oid functions for other object types. - Rename TSParserGetPrsid to get_ts_parser_oid. - Rename TSDictionaryGetDictid to get_ts_dict_oid. - Rename TSTemplateGetTmplid to get_ts_template_oid. - Rename TSConfigGetCfgid to get_ts_config_oid. - Rename FindConversionByName to get_conversion_oid. - Rename GetConstraintName to get_constraint_oid. - Add new functions get_opclass_oid, get_opfamily_oid, get_rewrite_oid, get_rewrite_oid_without_relid, get_trigger_oid, and get_cast_oid. The name of each function matches the corresponding catalog. Thanks to KaiGai Kohei for the review.	2010-08-05 15:25:36 +00:00
Robert Haas	2a6ef3445c	Standardize get_whatever_oid functions for object types with unqualified names. - Add a missing_ok parameter to get_tablespace_oid. - Avoid duplicating get_tablespace_od guts in objectNamesToOids. - Add a missing_ok parameter to get_database_oid. - Replace get_roleid and get_role_checked with get_role_oid. - Add get_namespace_oid, get_language_oid, get_am_oid. - Refactor existing code to use new interfaces. Thanks to KaiGai Kohei for the review.	2010-08-05 14:45:09 +00:00
Tom Lane	67becf8d41	Fix ANALYZE's ancient deficiency of not trying to collect stats for expression indexes when the index column type (the opclass opckeytype) is different from the expression's datatype. When coded, this limitation wasn't worth worrying about because we had no intelligence to speak of in stats collection for the datatypes used by such opclasses. However, now that there's non-toy estimation capability for tsvector queries, it amounts to a bug that ANALYZE fails to do this. The fix changes struct VacAttrStats, and therefore constitutes an API break for custom typanalyze functions. Therefore we can't back-patch it into released branches, but it was agreed that 9.0 isn't yet frozen hard enough to make such a change unacceptable. Ergo, back-patch to 9.0 but no further. The API break had better be mentioned in 9.0 release notes.	2010-08-01 22:38:11 +00:00
Simon Riggs	04e17bae50	Add explicit regression tests for ALTER TABLE lock levels. Use this to catch a couple of lock level assignments that slipped through manual testing, per Peter Eisentraut.	2010-07-29 11:06:34 +00:00
Simon Riggs	2dbbda02e7	Reduce lock levels of CREATE TRIGGER and some ALTER TABLE, CREATE RULE actions. Avoid hard-coding lockmode used for many altering DDL commands, allowing easier future changes of lock levels. Implementation of initial analysis on DDL sub-commands, so that many lock levels are now at ShareUpdateExclusiveLock or ShareRowExclusiveLock, allowing certain DDL not to block reads/writes. First of number of planned changes in this area; additional docs required when full project complete.	2010-07-28 05:22:24 +00:00
Robert Haas	b3b7d603fb	Allow REASSIGNED OWNED to handle opclasses and opfamilies. Backpatch to 8.3, which is as far back as we have opfamilies. The opclass portion could probably be backpatched to 8.2, when REASSIGN OWNED was added, but for now I have not done that. Asko Tiidumaa, with minor adjustments by me.	2010-07-03 13:53:13 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Tom Lane	d1e027221d	Replace the pg_listener-based LISTEN/NOTIFY mechanism with an in-memory queue. In addition, add support for a "payload" string to be passed along with each notify event. This implementation should be significantly more efficient than the old one, and is also more compatible with Hot Standby usage. There is not yet any facility for HS slaves to receive notifications generated on the master, although such a thing is possible in future. Joachim Wieland, reviewed by Jeff Davis; also hacked on by me.	2010-02-16 22:34:57 +00:00
Andrew Dunstan	fc5173ad51	Add query text to auto_explain output. Still to be done: fix docs and fix regression failures under auto_explain.	2010-02-16 22:19:59 +00:00
Tom Lane	cbe9d6beb4	Fix up rickety handling of relation-truncation interlocks. Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr relation entries, so that they will get reset to InvalidBlockNumber whenever an smgr-level flush happens. Because we now send smgr invalidation messages immediately (not at end of transaction) when a relation truncation occurs, this ensures that other backends will reset their values before they next access the relation. We no longer need the unreliable assumption that a VACUUM that's doing a truncation will hold its AccessExclusive lock until commit --- in fact, we can intentionally release that lock as soon as we've completed the truncation. This patch therefore reverts (most of) Alvaro's patch of 2009-11-10, as well as my marginal hacking on it yesterday. We can also get rid of assorted no-longer-needed relcache flushes, which are far more expensive than an smgr flush because they kill a lot more state. In passing this patch fixes smgr_redo's failure to perform visibility-map truncation, and cleans up some rather dubious assumptions in freespace.c and visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of date.	2010-02-09 21:43:30 +00:00
Tom Lane	0a469c8769	Remove old-style VACUUM FULL (which was known for a little while as VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity. Per discussion, the use case for this method of vacuuming is no longer large enough to justify maintaining it; not to mention that we don't wish to invest the work that would be needed to make it play nicely with Hot Standby. Aside from the code directly related to old-style VACUUM FULL, this commit removes support for certain WAL record types that could only be generated within VACUUM FULL, redirect-pointer removal in heap_page_prune, and nontransactional generation of cache invalidation sinval messages (the last being the sticking point for Hot Standby). We still have to retain all code that copes with finding HEAP_MOVED_OFF and HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long as we want to support in-place update from pre-9.0 databases.	2010-02-08 04:33:55 +00:00
Tom Lane	b9b8831ad6	Create a "relation mapping" infrastructure to support changing the relfilenodes of shared or nailed system catalogs. This has two key benefits: * The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs. * We no longer have to use an unsafe reindex-in-place approach for reindexing shared catalogs. CLUSTER on nailed catalogs now works too, although I left it disabled on shared catalogs because the resulting pg_index.indisclustered update would only be visible in one database. Since reindexing shared system catalogs is now fully transactional and crash-safe, the former special cases in REINDEX behavior have been removed; shared catalogs are treated the same as non-shared. This commit does not do anything about the recently-discussed problem of deadlocks between VACUUM FULL/CLUSTER on a system catalog and other concurrent queries; will address that in a separate patch. As a stopgap, parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid such failures during the regression tests.	2010-02-07 20:48:13 +00:00
Tom Lane	9727c583fe	Restructure CLUSTER/newstyle VACUUM FULL/ALTER TABLE support so that swapping of old and new toast tables can be done either at the logical level (by swapping the heaps' reltoastrelid links) or at the physical level (by swapping the relfilenodes of the toast tables and their indexes). This is necessary infrastructure for upcoming changes to support CLUSTER/VAC FULL on shared system catalogs, where we cannot change reltoastrelid. The physical swap saves a few catalog updates too. We unfortunately have to keep the logical-level swap logic because in some cases we will be adding or deleting a toast table, so there's no possibility of a physical swap. However, that only happens as a consequence of schema changes in the table, which we do not need to support for system catalogs, so such cases aren't an obstacle for that. In passing, refactor the cluster support functions a little bit to eliminate unnecessarily-duplicated code; and fix the problem that while CLUSTER had been taught to rename the final toast table at need, ALTER TABLE had not.	2010-02-04 00:09:14 +00:00
Robert Haas	63f9282f6e	Tighten integrity checks on ALTER TABLE ... ALTER COLUMN ... RENAME. When a column is renamed, we recursively rename the same column in all descendent tables. But if one of those tables also inherits that column from a table outside the inheritance hierarchy rooted at the named table, we must throw an error. The previous coding correctly prohibited the rename when the parent had inherited the column from elsewhere, but overlooked the case where the parent was OK but a child table also inherited the same column from a second, unrelated parent. For now, not backpatched due to lack of complaints from the field. KaiGai Kohei, with further changes by me. Reviewed by Bernd Helme and Tom Lane.	2010-02-01 19:28:56 +00:00
Tom Lane	9a915e596f	Improve the handling of SET CONSTRAINTS commands by having them search pg_constraint before searching pg_trigger. This allows saner handling of corner cases; in particular we now say "constraint is not deferrable" rather than "constraint does not exist" when the command is applied to a constraint that's inherently non-deferrable. Per a gripe several months ago from hubert depesz lubaczewski. To make this work without breaking user-defined constraint triggers, we have to add entries for them to pg_constraint. However, in return we can remove the pgconstrname column from pg_constraint, which represents a fairly sizable space savings. I also replaced the tgisconstraint column with tgisinternal; the old meaning of tgisconstraint can now be had by testing for nonzero tgconstraint, while there is no other way to get the old meaning of nonzero tgconstraint, namely that the trigger was internally generated rather than being user-created. In passing, fix an old misstatement in the docs and comments, namely that pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable. Actually, we mark RI action triggers as nondeferrable even when they belong to a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on that instead of hard-coding a list of exception OIDs.	2010-01-17 22:56:23 +00:00
Tom Lane	901be0fad4	Remove all the special-case code for INT64_IS_BUSTED, per decision that we're not going to support that anymore. I did keep the 64-bit-CRC-with-32-bit-arithmetic code, since it has a performance excuse to live. It's a bit moot since that's all ifdef'd out, of course.	2010-01-07 04:53:35 +00:00
Robert Haas	814c8a03ba	Further fixes for per-tablespace options patch. Add missing varlena header to TableSpaceOpts structure. And, per Tom Lane, instead of calling tablespace_reloptions in CacheMemoryContext, call it in the caller's memory context and copy the value over afterwards, to reduce the chances of a session-lifetime memory leak.	2010-01-07 03:53:08 +00:00
Itagaki Takahiro	946cf229e8	Support rewritten-based full vacuum as VACUUM FULL. Traditional VACUUM FULL was renamed to VACUUM FULL INPLACE. Also added a new option -i, --inplace for vacuumdb to perform FULL INPLACE vacuuming. Since the new VACUUM FULL uses CLUSTER infrastructure, we cannot use it for system tables. VACUUM FULL for system tables always fall back into VACUUM FULL INPLACE silently. Itagaki Takahiro, reviewed by Jeff Davis and Simon Riggs.	2010-01-06 05:31:14 +00:00
Robert Haas	d86d51a958	Support ALTER TABLESPACE name SET/RESET ( tablespace_options ). This patch only supports seq_page_cost and random_page_cost as parameters, but it provides the infrastructure to scalably support many more. In particular, we may want to add support for effective_io_concurrency, but I'm leaving that as future work for now. Thanks to Tom Lane for design help and Alvaro Herrera for the review.	2010-01-05 21:54:00 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Bruce Momjian	c44327afa4	Binary upgrade: Modify pg_dump --binary-upgrade and add backend support routines to support the preservation of pg_type oids when doing a binary upgrade. This allows user-defined composite types and arrays to be binary upgraded.	2009-12-24 22:09:24 +00:00
Tom Lane	cfc5008a51	Adjust naming of indexes and their columns per recent discussion. Index expression columns are now named after the FigureColname result for their expressions, rather than always being "pg_expression_N". Digits are appended to this name if needed to make the column name unique within the index. (That happens for regular columns too, thus fixing the old problem that CREATE INDEX fooi ON foo (f1, f1) fails. Before exclusion indexes there was no real reason to do such a thing, but now maybe there is.) Default names for indexes and associated constraints now include the column names of all their columns, not only the first one as in previous practice. (Of course, this will be truncated as needed to fit in NAMEDATALEN. Also, pkey indexes retain the historical behavior of not naming specific columns at all.) An example of the results: regression=# create table foo (f1 int, f2 text, regression(# exclude (f1 with =, lower(f2) with =)); NOTICE: CREATE TABLE / EXCLUDE will create implicit index "foo_f1_lower_exclusion" for table "foo" CREATE TABLE regression=# \d foo_f1_lower_exclusion Index "public.foo_f1_lower_exclusion" Column \| Type \| Definition --------+---------+------------ f1 \| integer \| f1 lower \| text \| lower(f2) btree, for table "public.foo"	2009-12-23 02:35:25 +00:00
Robert Haas	cddca5ec13	Add an EXPLAIN (BUFFERS) option to show buffer-usage statistics. This patch also removes buffer-usage statistics from the track_counts output, since this (or the global server statistics) is deemed to be a better interface to this information. Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.	2009-12-15 04:57:48 +00:00
Robert Haas	02490d4692	Export ExplainBeginOutput() and ExplainEndOutput() for auto_explain. Without these functions, anyone outside of explain.c can't actually use ExplainPrintPlan, because the ExplainState won't be initialized properly. The user-visible result of this was a crash when using auto_explain with the JSON output format. Report by Euler Taveira de Oliveira. Analysis by Tom Lane. Patch by me.	2009-12-12 00:35:34 +00:00
Andrew Dunstan	324385d67f	Add YAML to list of EXPLAIN formats. Greg Sabino Mullane, reviewed by Takahiro Itagaki.	2009-12-11 01:33:35 +00:00
Tom Lane	0cb65564e5	Add exclusion constraints, which generalize the concept of uniqueness to support any indexable commutative operator, not just equality. Two rows violate the exclusion constraint if "row1.col OP row2.col" is TRUE for each of the columns in the constraint. Jeff Davis, reviewed by Robert Haas	2009-12-07 05:22:23 +00:00
Tom Lane	7fc0f06221	Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be checked to determine whether the trigger should be fired. For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER triggers it can provide a noticeable performance improvement, since queuing of a deferred trigger event and re-fetching of the row(s) at end of statement can be short-circuited if the trigger does not need to be fired. Takahiro Itagaki, reviewed by KaiGai Kohei.	2009-11-20 20:38:12 +00:00
Tom Lane	c742b795dd	Add a hook to CREATE/ALTER ROLE to allow an external module to check the strength of database passwords, and create a sample implementation of such a hook as a new contrib module "passwordcheck". Laurenz Albe, reviewed by Takahiro Itagaki	2009-11-18 21:57:56 +00:00
Alvaro Herrera	e7ec022266	Fix longstanding problems in VACUUM caused by untimely interruptions In VACUUM FULL, an interrupt after the initial transaction has been recorded as committed can cause postmaster to restart with the following error message: PANIC: cannot abort transaction NNNN, it was already committed This problem has been reported many times. In lazy VACUUM, an interrupt after the table has been truncated by lazy_truncate_heap causes other backends' relcache to still point to the removed pages; this can cause future INSERT and UPDATE queries to error out with the following error message: could not read block XX of relation 1663/NNN/MMMM: read only 0 of 8192 bytes The window to this race condition is extremely narrow, but it has been seen in the wild involving a cancelled autovacuum process. The solution for both problems is to inhibit interrupts in both operations until after the respective transactions have been committed. It's not a complete solution, because the transaction could theoretically be aborted by some other error, but at least fixes the most common causes of both problems.	2009-11-10 18:00:06 +00:00
Tom Lane	9f2ee8f287	Re-implement EvalPlanQual processing to improve its performance and eliminate a lot of strange behaviors that occurred in join cases. We now identify the "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the appropriate row into each scan node in the rechecking plan, forcing it to emit only that one row. The former behavior could rescan the whole of each joined relation for each recheck, which was terrible for performance, and what's much worse could result in duplicated output tuples. Also, the original implementation of EvalPlanQual could not re-use the recheck execution tree --- it had to go through a full executor init and shutdown for every row to be tested. To avoid this overhead, I've associated a special runtime Param with each LockRows or ModifyTable plan node, and arranged to make every scan node below such a node depend on that Param. Thus, by signaling a change in that Param, the EPQ machinery can just rescan the already-built test plan. This patch also adds a prohibition on set-returning functions in the targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the duplicate-output-tuple problem. It seems fairly reasonable since the other restrictions on SELECT FOR UPDATE are meant to ensure that there is a unique correspondence between source tuples and result tuples, which an output SRF destroys as much as anything else does.	2009-10-26 02:26:45 +00:00
Andrew Dunstan	faa1afc6c1	CREATE LIKE INCLUDING COMMENTS and STORAGE, and INCLUDING ALL shortcut. Itagaki Takahiro.	2009-10-12 19:49:24 +00:00
Tom Lane	8a5849b7ff	Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. They are now handled by a new plan node type called ModifyTable, which is placed at the top of the plan tree. In itself this change doesn't do much, except perhaps make the handling of RETURNING lists and inherited UPDATEs a tad less klugy. But it is necessary preparation for the intended extension of allowing RETURNING queries inside WITH. Marko Tiikkaja	2009-10-10 01:43:50 +00:00
Tom Lane	9048b73184	Implement the DO statement to support execution of PL code without having to create a function for it. Procedural languages now have an additional entry point, namely a function to execute an inline code block. This seemed a better design than trying to hide the transient-ness of the code from the PL. As of this patch, only plpgsql has an inline handler, but probably people will soon write handlers for the other standard PLs. In passing, remove the long-dead LANCOMPILER option of CREATE LANGUAGE. Petr Jelinek	2009-09-22 23:43:43 +00:00
Tom Lane	9bd27b7c9e	Extend EXPLAIN to support output in XML or JSON format. There are probably still some adjustments to be made in the details of the output, but this gets the basic structure in place. Robert Haas	2009-08-10 05:46:50 +00:00
Tom Lane	3783c9d4df	Remove long-since-unused file commands/version.h. Noticed by Itagaki Takahiro.	2009-08-07 16:19:57 +00:00
Tom Lane	2487d872e0	Create a multiplexing structure for signals to Postgres child processes. This patch gets us out from under the Unix limitation of two user-defined signal types. We already had done something similar for signals directed to the postmaster process; this adds multiplexing for signals directed to backends and auxiliary processes (so long as they're connected to shared memory). As proof of concept, replace the former usage of SIGUSR1 and SIGUSR2 for backends with use of the multiplexing mechanism. There are still some hard-wired definitions of SIGUSR1 and SIGUSR2 for other process types, but getting rid of those doesn't seem interesting at the moment. Fujii Masao	2009-07-31 20:26:23 +00:00
Tom Lane	25d9bf2e3e	Support deferrable uniqueness constraints. The current implementation fires an AFTER ROW trigger for each tuple that looks like it might be non-unique according to the index contents at the time of insertion. This works well as long as there aren't many conflicts, but won't scale to massive unique-key reassignments. Improving that case is a TODO item. Dean Rasheed	2009-07-29 20:56:21 +00:00
Tom Lane	c1b9ec24ef	Add system catalog columns pg_constraint.conindid and pg_trigger.tgconstrindid. conindid is the index supporting a constraint. We can use this not only for unique/primary-key constraints, but also foreign-key constraints, which depend on the unique index that constrains the referenced columns. tgconstrindid is just copied from the constraint's conindid field, or is zero for triggers not associated with constraints. This is mainly intended as infrastructure for upcoming patches, but it has some virtue in itself, since it exposes a relationship that you formerly had to grovel in pg_depend to determine. I simplified one information_schema view accordingly. (There is a pg_dump query that could also use conindid, but I left it alone because it wasn't clear it'd get any faster.)	2009-07-28 02:56:31 +00:00
Tom Lane	d4382c4ae7	Extend EXPLAIN to allow generic options to be specified. The original syntax made it difficult to add options without making them into reserved words. This change parenthesizes the options to avoid that problem, and makes provision for an explicit (and perhaps non-Boolean) value for each option. The original syntax is still supported, but only for the two original options ANALYZE and VERBOSE. As a test case, add a COSTS option that can suppress the planner cost estimates. This may be useful for including EXPLAIN output in the regression tests, which are otherwise unable to cope with cross-platform variations in cost estimates. Robert Haas	2009-07-26 23:34:18 +00:00
Peter Eisentraut	de160e2c00	Make backend header files C++ safe This alters various incidental uses of C++ key words to use other similar identifiers, so that a C++ compiler won't choke outright. You still (probably) need extern "C" { }; around the inclusion of backend headers. based on a patch by Kurt Harriman <harriman@acm.org> Also add a script cpluspluscheck to check for C++ compatibility in the future. As of right now, this passes without error for me.	2009-07-16 06:33:46 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Tom Lane	090173a3f9	Remove the recently added node types ReloptElem and OptionDefElem in favor of adding optional namespace and action fields to DefElem. Having three node types that do essentially the same thing bloats the code and leads to errors of confusion, such as in yesterday's bug report from Khee Chin.	2009-04-04 21:12:31 +00:00
Alvaro Herrera	3a5b773715	Allow reloption names to have qualifiers, initially supporting a TOAST qualifier, and add support for this in pg_dump. This allows TOAST tables to have user-defined fillfactor, and will also enable us to move the autovacuum parameters to reloptions without taking away the possibility of setting values for TOAST tables.	2009-02-02 19:31:40 +00:00
Tom Lane	3cb5d6580a	Support column-level privileges, as required by SQL standard. Stephen Frost, with help from KaiGai Kohei and others	2009-01-22 20:16:10 +00:00
Heikki Linnakangas	c079090bbc	Update comments to reflect that tgenabled is not a boolean anymore. Jonah Harris, with minor tinkering by me.	2009-01-22 19:16:31 +00:00
Heikki Linnakangas	6587818542	Add vacuum_freeze_table_age GUC option, to control when VACUUM should ignore the visibility map and scan the whole table, to advance relfrozenxid.	2009-01-16 13:27:24 +00:00
Tom Lane	bbeb0bbf6b	Include a pointer to the query's source text in QueryDesc structs. This is practically free given prior 8.4 changes in plancache and portal management, and it makes it a lot easier for ExecutorStart/Run/End hooks to get at the query text. Extracted from Itagaki Takahiro's pg_stat_statements patch, with minor editorialization.	2009-01-02 20:42:00 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Peter Eisentraut	cae565e503	SQL/MED catalog manipulation facilities This doesn't do any remote or external things yet, but it gives modules like plproxy and dblink a standardized and future-proof system for managing their connection information. Martin Pihlak and Peter Eisentraut	2008-12-19 16:25:19 +00:00
Heikki Linnakangas	dcf8409985	Don't reset pg_class.reltuples and relpages in VACUUM, if any pages were skipped. We could update relpages anyway, but it seems better to only update it together with reltuples, because we use the reltuples/relpages ratio in the planner. Also don't update n_live_tuples in pgstat. ANALYZE in VACUUM ANALYZE now needs to update pg_class, if the VACUUM-phase didn't do so. Added some boolean-passing to let analyze_rel know if it should update pg_class or not. I also moved the relcache invalidation (to update rd_targblock) from vac_update_relstats to where RelationTruncate is called, because vac_update_relstats is not called for partial vacuums anymore. It's more obvious to send the invalidation close to the truncation that requires it. Per report by Ned T. Crigler.	2008-12-17 09:15:03 +00:00
Peter Eisentraut	455dffbb73	Default values for function arguments Pavel Stehule, with some tweaks by Peter Eisentraut	2008-12-04 17:51:28 +00:00
Tom Lane	cd35e9d746	Some infrastructure changes for the upcoming auto-explain contrib module: * Refactor explain.c slightly to export a convenient-to-use subroutine for printing EXPLAIN results. * Provide hooks for plugins to get control at ExecutorStart and ExecutorEnd as well as ExecutorRun. * Add some minimal support for tracking the total runtime of ExecutorRun. This code won't actually do anything unless a plugin prods it to. * Change the API of the DefineCustomXXXVariable functions to allow nonzero "flags" to be specified for a custom GUC variable. While at it, also make the "bootstrap" default value for custom GUCs be explicitly specified as a parameter to these functions. This is to eliminate confusion over where the default comes from, as has been expressed in the past by some users of the custom-variable facility. * Refactor GUC code a bit to ensure that a custom variable gets initialized to something valid (like its default value) even if the placeholder value was invalid.	2008-11-19 01:10:24 +00:00
Tom Lane	c5451c22e3	Make relhasrules and relhastriggers work like relhasindex, namely we let VACUUM reset them to false rather than trying to clean 'em up during DROP.	2008-11-10 00:49:37 +00:00
Tom Lane	6517f377d6	Implement ALTER DATABASE SET TABLESPACE to move a whole database (or at least as much of it as lives in its default tablespace) to a new tablespace. Guillaume Lelarge, with some help from Bernd Helmle and Tom Lane	2008-11-07 18:25:07 +00:00
Tom Lane	312b1a983f	Reduce the memory footprint of large pending-trigger-event lists, as per my recent proposal. In typical cases, we now need 12 bytes per insert or delete event and 16 bytes per update event; previously we needed 40 bytes per event on 32-bit hardware and 80 bytes per event on 64-bit hardware. Even in the worst case usage pattern with a large number of distinct triggers being fired in one query, usage is at most 32 bytes per event. It seems to be a bit faster than the old code as well, due to reduction of palloc overhead. This commit doesn't address the TODO item of allowing the event list to spill to disk; rather it's trying to stave off the need for that. However, it probably makes that task a bit easier by reducing the data structure's dependency on pointers. It would now be practical to dump an event list to disk by "chunks" instead of individual events.	2008-10-24 23:42:35 +00:00
Magnus Hagander	7626f2a936	Mark SessionReplicationRole as PGDLLIMPORT so it can be used from Slony functions. Per report from Hiroshi Saito.	2008-09-19 14:43:46 +00:00
Alvaro Herrera	3ccde312ec	Have autovacuum consider processing TOAST tables separately from their main tables. This requires vacuum() to accept processing a toast table standalone, so there's a user-visible change in that it's now possible (for a superuser) to execute "VACUUM pg_toast.pg_toast_XXX".	2008-08-13 00:07:50 +00:00

... 3 4 5 6 7 ...

838 Commits