postgresql

Commit Graph

Author	SHA1	Message	Date
Simon Riggs	4b2d44031f	MERGE post-commit review Review comments from Andres Freund * Consolidate code into AfterTriggerGetTransitionTable() * Rename nodeMerge.c to execMerge.c * Rename nodeMerge.h to execMerge.h * Move MERGE handling in ExecInitModifyTable() into a execMerge.c ExecInitMerge() * Move mt_merge_subcommands flags into execMerge.h * Rename opt_and_condition to opt_merge_when_and_condition * Wordsmith various comments Author: Pavan Deolasee Reviewer: Simon Riggs	2018-04-05 09:54:07 +01:00
Simon Riggs	d204ef6377	MERGE SQL Command following SQL:2016 MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows a task that would other require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com	2018-04-03 09:28:16 +01:00
Simon Riggs	7cf8a5c302	Revert "Modified files for MERGE" This reverts commit `354f13855e`.	2018-04-02 21:34:15 +01:00
Simon Riggs	354f13855e	Modified files for MERGE	2018-04-02 21:12:47 +01:00
Andres Freund	d87510a524	Combine options for RangeVarGetRelidExtended() into a flags argument. A followup patch will add a SKIP_LOCKED option. To avoid introducing evermore arguments, breaking existing callers each time, introduce a flags argument. This'll no doubt break a few external users... Also change the MISSING_OK behaviour so a DEBUG1 debug message is emitted when a relation is not found. Author: Nathan Bossart Reviewed-By: Michael Paquier and Andres Freund Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de	2018-03-30 17:05:16 -07:00
Alvaro Herrera	86f575948c	Allow FOR EACH ROW triggers on partitioned tables Previously, FOR EACH ROW triggers were not allowed in partitioned tables. Now we allow AFTER triggers on them, and on trigger creation we cascade to create an identical trigger in each partition. We also clone the triggers to each partition that is created or attached later. This means that deferred unique keys are allowed on partitioned tables, too. Author: Álvaro Herrera Reviewed-by: Peter Eisentraut, Simon Riggs, Amit Langote, Robert Haas, Thomas Munro Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql	2018-03-23 10:48:22 -03:00
Tom Lane	25b692568f	Prevent dangling-pointer access when update trigger returns old tuple. A before-update row trigger may choose to return the "new" or "old" tuple unmodified. ExecBRUpdateTriggers failed to consider the second possibility, and would proceed to free the "old" tuple even if it was the one returned, leading to subsequent access to already-deallocated memory. In debug builds this reliably leads to an "invalid memory alloc request size" failure; in production builds it might accidentally work, but data corruption is also possible. This is a very old bug. There are probably a couple of reasons it hasn't been noticed up to now. It would be more usual to return NULL if one wanted to suppress the update action; returning "old" is significantly less efficient since the update will occur anyway. Also, none of the standard PLs would ever cause this because they all returned freshly-manufactured tuples even if they were just copying "old". But commit `4b93f5799` changed that for plpgsql, making it possible to see the bug with a plpgsql trigger. Still, this is certainly legal behavior for a trigger function, so it's ExecBRUpdateTriggers's fault not plpgsql's. It seems worth creating a test case that exercises returning "old" directly with a C-language trigger; testing this through plpgsql seems unreliable because its behavior might change again. Report and fix by Rushabh Lathia; regression test case by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAGPqQf1P4pjiNPrMof=P_16E-DFjt457j+nH2ex3=nBTew7tXw@mail.gmail.com	2018-02-27 13:28:02 -05:00
Andres Freund	ad7dbee368	Allow tupleslots to have a fixed tupledesc, use in executor nodes. The reason for doing so is that it will allow expression evaluation to optimize based on the underlying tupledesc. In particular it will allow to JIT tuple deforming together with the expression itself. For that expression initialization needs to be moved after the relevant slots are initialized - mostly unproblematic, except in the case of nodeWorktablescan.c. After doing so there's no need for ExecAssignResultType() and ExecAssignResultTypeFromTL() anymore, as all former callers have been converted to create a slot with a fixed descriptor. When creating a slot with a fixed descriptor, tts_values/isnull can be allocated together with the main slot, reducing allocation overhead and increasing cache density a bit. Author: Andres Freund Discussion: https://postgr.es/m/20171206093717.vqdxe5icqttpxs3p@alap3.anarazel.de	2018-02-16 21:17:38 -08:00
Robert Haas	2f17844104	Allow UPDATE to move rows between partitions. When an UPDATE causes a row to no longer match the partition constraint, try to move it to a different partition where it does match the partition constraint. In essence, the UPDATE is split into a DELETE from the old partition and an INSERT into the new one. This can lead to surprising behavior in concurrency scenarios because EvalPlanQual rechecks won't work as they normally did; the known problems are documented. (There is a pending patch to improve the situation further, but it needs more review.) Amit Khandekar, reviewed and tested by Amit Langote, David Rowley, Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro Herrera, Amit Kapila, and me. A few final revisions by me. Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com	2018-01-19 15:33:06 -05:00
Peter Eisentraut	8b9e9644dc	Replace AclObjectKind with ObjectType AclObjectKind was basically just another enumeration for object types, and we already have a preferred one for that. It's only used in aclcheck_error. By using ObjectType instead, we can also give some more precise error messages, for example "index" instead of "relation". Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-01-19 14:01:15 -05:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Peter Eisentraut	2eb4a831e5	Change TRUE/FALSE to true/false The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-08 11:37:28 -05:00
Tom Lane	5fa6b0d102	Remove unnecessary PG_TRY overhead for CurrentResourceOwner changes. resowner/README contained advice to use a PG_TRY block to restore the old CurrentResourceOwner value anywhere that that variable is transiently changed. That advice was only inconsistently followed, however, and on reflection it seems like unnecessary overhead. We don't bother with such a convention for transient CurrentMemoryContext changes, on the grounds that any (sub)transaction abort will start out by resetting CurrentMemoryContext to what it wants. But the same is true of CurrentResourceOwner, so there seems no need to treat it differently. Hence, remove PG_TRY blocks that exist only to restore CurrentResourceOwner before re-throwing the error. There are a couple of places that restore it along with some other actions, and I left those alone; the restore is probably unnecessary but no noticeable gain will result from removing it. Discussion: https://postgr.es/m/5236.1507583529@sss.pgh.pa.us	2017-10-11 17:44:09 -04:00
Tom Lane	27c6619e9c	Fix possible dangling pointer dereference in trigger.c. AfterTriggerEndQuery correctly notes that the query_stack could get repalloc'd during a trigger firing, but it nonetheless passes the address of a query_stack entry to afterTriggerInvokeEvents, so that if such a repalloc occurs, afterTriggerInvokeEvents is already working with an obsolete dangling pointer while it scans the rest of the events. Oops. The only code at risk is its "delete_ok" cleanup code, so we can prevent unsafe behavior by passing delete_ok = false instead of true. However, that could have a significant performance penalty, because the point of passing delete_ok = true is to not have to re-scan possibly a large number of dead trigger events on the next time through the loop. There's more than one way to skin that cat, though. What we can do is delete all the "chunks" in the event list except the last one, since we know all events in them must be dead. Deleting the chunks is work we'd have had to do later in AfterTriggerEndQuery anyway, and it ends up saving rescanning of just about the same events we'd have gotten rid of with delete_ok = true. In v10 and HEAD, we also have to be careful to mop up any per-table after_trig_events pointers that would become dangling. This is slightly annoying, but I don't think that normal use-cases will traverse this code path often enough for it to be a performance problem. It's pretty hard to hit this in practice because of the unlikelihood of the query_stack getting resized at just the wrong time. Nonetheless, it's definitely a live bug of ancient standing, so back-patch to all supported branches. Discussion: https://postgr.es/m/2891.1505419542@sss.pgh.pa.us	2017-09-17 14:50:01 -04:00
Tom Lane	fd31f9f033	Ensure that BEFORE STATEMENT triggers fire the right number of times. Commit `0f79440fb` introduced mechanism to keep AFTER STATEMENT triggers from firing more than once per statement, which was formerly possible if more than one FK enforcement action had to be applied to a given table. Add a similar mechanism for BEFORE STATEMENT triggers, so that we don't have the unexpected situation of firing BEFORE STATEMENT triggers more often than AFTER STATEMENT. As with the previous patch, back-patch to v10. Discussion: https://postgr.es/m/22315.1505584992@sss.pgh.pa.us	2017-09-17 12:16:38 -04:00
Tom Lane	0f79440fb0	Fix SQL-spec incompatibilities in new transition table feature. The standard says that all changes of the same kind (insert, update, or delete) caused in one table by a single SQL statement should be reported in a single transition table; and by that, they mean to include foreign key enforcement actions cascading from the statement's direct effects. It's also reasonable to conclude that if the standard had wCTEs, they would say that effects of wCTEs applying to the same table as each other or the outer statement should be merged into one transition table. We weren't doing it like that. Hence, arrange to merge tuples from multiple update actions into a single transition table as much as we can. There is a problem, which is that if the firing of FK enforcement triggers and after-row triggers with transition tables is interspersed, we might need to report more tuples after some triggers have already seen the transition table. It seems like a bad idea for the transition table to be mutable between trigger calls. There's no good way around this without a major redesign of the FK logic, so for now, resolve it by opening a new transition table each time this happens. Also, ensure that AFTER STATEMENT triggers fire just once per statement, or once per transition table when we're forced to make more than one. Previous versions of Postgres have allowed each FK enforcement query to cause an additional firing of the AFTER STATEMENT triggers for the referencing table, but that's certainly not per spec. (We're still doing multiple firings of BEFORE STATEMENT triggers, though; is that something worth changing?) Also, forbid using transition tables with column-specific UPDATE triggers. The spec requires such transition tables to show only the tuples for which the UPDATE trigger would have fired, which means maintaining multiple transition tables or else somehow filtering the contents at readout. Maybe someday we'll bother to support that option, but it looks like a lot of trouble for a marginal feature. The transition tables are now managed by the AfterTriggers data structures, rather than being directly the responsibility of ModifyTable nodes. This removes a subtransaction-lifespan memory leak introduced by my previous band-aid patch `3c4359521`. In passing, refactor the AfterTriggers data structures to reduce the management overhead for them, by using arrays of structs rather than several parallel arrays for per-query-level and per-subtransaction state. I failed to resist the temptation to do some copy-editing on the SGML docs about triggers, above and beyond merely documenting the effects of this patch. Back-patch to v10, because we don't want the semantics of transition tables to change post-release. Patch by me, with help and review from Thomas Munro. Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org	2017-09-16 13:20:36 -04:00
Peter Eisentraut	821fb8cdbf	Message style fixes	2017-09-11 11:21:27 -04:00
Tom Lane	3c43595217	Quick-hack fix for foreign key cascade vs triggers with transition tables. AFTER triggers using transition tables crashed if they were fired due to a foreign key ON CASCADE update. This is because ExecEndModifyTable flushes the transition tables, on the assumption that any trigger that could need them was already fired during ExecutorFinish. Normally that's true, because we don't allow transition-table-using triggers to be deferred. However, foreign key CASCADE updates force any triggers on the referencing table to be deferred to the outer query level, by means of the EXEC_FLAG_SKIP_TRIGGERS flag. I don't recall all the details of why it's like that and am pretty loath to redesign it right now. Instead, just teach ExecEndModifyTable to skip destroying the TransitionCaptureState when that flag is set. This will allow the transition table data to survive until end of the current subtransaction. This isn't a terribly satisfactory solution, because (1) we might be leaking the transition tables for much longer than really necessary, and (2) as things stand, an AFTER STATEMENT trigger will fire once per RI updating query, ie once per row updated or deleted in the referenced table. I suspect that is not per SQL spec. But redesigning this is a research project that we're certainly not going to get done for v10. So let's go with this hackish answer for now. In passing, tweak AfterTriggerSaveEvent to not save the transition_capture pointer into the event record for a deferrable trigger. This is not necessary to fix the current bug, but it avoids letting dangling pointers to long-gone transition tables persist in the trigger event queue. That's at least a safety feature. It might also allow merging shared trigger states in more cases than before. I added a regression test that demonstrates the crash on unpatched code, and also exposes the behavior of firing the AFTER STATEMENT triggers once per row update. Per bug #14808 from Philippe Beaudoin. Back-patch to v10. Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org	2017-09-10 14:59:56 -04:00
Tom Lane	21d304dfed	Final pgindent + perltidy run for v10.	2017-08-14 17:29:33 -04:00
Andrew Gierth	8c55244ae3	Fix transition tables for ON CONFLICT. We now disallow having triggers with both transition tables and ON INSERT OR UPDATE (which was a PG extension to the spec anyway), because in this case it's not at all clear how the transition tables should work for an INSERT ... ON CONFLICT query. Separate ON INSERT and ON UPDATE triggers with transition tables are allowed, and the transition tables for these reflect only the inserted and only the updated tuples respectively. Patch by Thomas Munro Discussion: https://postgr.es/m/CAEepm%3D11KHQ0JmETJQihSvhZB5mUZL2xrqHeXbCeLhDiqQ39%3Dw%40mail.gmail.com	2017-06-28 19:00:55 +01:00
Andrew Gierth	c46c0e5202	Fix transition tables for wCTEs. The original coding didn't handle this case properly; each separate DML substatement needs its own set of transitions. Patch by Thomas Munro Discussion: https://postgr.es/m/CAL9smLCDQ%3D2o024rBgtD4WihzX8B3C6u_oSQ2K3%2BR5grJrV0bg%40mail.gmail.com	2017-06-28 18:59:01 +01:00
Andrew Gierth	501ed02cf6	Fix transition tables for partition/inheritance. We disallow row-level triggers with transition tables on child tables. Transition tables for triggers on the parent table contain only those columns present in the parent. (We can't mix tuple formats in a single transition table.) Patch by Thomas Munro Discussion: https://postgr.es/m/CA%2BTgmoZzTBBAsEUh4MazAN7ga%3D8SsMC-Knp-6cetts9yNZUCcg%40mail.gmail.com	2017-06-28 18:55:03 +01:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Tom Lane	e3860ffa4d	Initial pgindent run with pg_bsd_indent version 2.0. The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname varname", as well as some similar cases for const- or volatile-qualified pointers. Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:39:04 -04:00
Bruce Momjian	a6fd7b7a5f	Post-PG 10 beta1 pgindent run perltidy run not included.	2017-05-17 16:31:56 -04:00
Robert Haas	59f40566ca	Fix relcache leak when row triggers on partitions are fired by COPY. Thomas Munro, reviewed by Amit Langote Discussion: http://postgr.es/m/CAEepm=15Jss-yhFApuKzxcoCuFnb8TR8iQiWMjG=CLYPx48QLw@mail.gmail.com	2017-05-16 12:46:32 -04:00
Robert Haas	9e6104c667	Prohibit transition tables on views and foreign tables. Thomas Munro, per off-list report from Prabhat Sabu. Changes to the message wording for consistency with the existing relkind check for partitioned tables by me. Discussion: http://postgr.es/m/CAEepm=2xJFFpGM+N=gpWx-9Nft2q1oaFZX07_y23AHCrJQLt0g@mail.gmail.com	2017-05-09 23:34:02 -04:00
Robert Haas	29fd3d9da0	Don't permit transition tables with TRUNCATE triggers. Prior to this prohibition, such a trigger caused a crash. Thomas Munro, per a report from Neha Sharma. I added a regression test. Discussion: http://postgr.es/m/CAEepm=0VR5W-N38eTkO_FqJbGqQ_ykbBRmzmvHyxDhy1p=0Csw@mail.gmail.com	2017-05-09 23:24:23 -04:00
Tom Lane	8f0530f580	Improve castNode notation by introducing list-extraction-specific variants. This extends the castNode() notation introduced by commit `5bcab1114` to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us	2017-04-10 13:51:53 -04:00
Kevin Grittner	18ce3a4ab2	Add infrastructure to support EphemeralNamedRelation references. A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others	2017-03-31 23:17:18 -05:00
Andres Freund	b8d7f053c5	Faster expression evaluation and targetlist projection. This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de	2017-03-25 14:52:06 -07:00
Noah Misch	3a0d473192	Use wrappers of PG_DETOAST_DATUM_PACKED() more. This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.	2017-03-12 19:35:34 -04:00
Tom Lane	ab02896510	Provide CatalogTupleDelete() as a wrapper around simple_heap_delete(). This extends the work done in commit `2f5c9d9c9` to provide a more nearly complete abstraction layer hiding the details of index updating for catalog changes. That commit only invented abstractions for catalog inserts and updates, leaving nearby code for catalog deletes still calling the heap-level routines directly. That seems rather ugly from here, and it does little to help if we ever want to shift to a storage system in which indexing work is needed at delete time. Hence, create a wrapper function CatalogTupleDelete(), and replace calls of simple_heap_delete() on catalog tuples with it. There are now very few direct calls of [simple_]heap_delete remaining in the tree. Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us	2017-02-01 16:13:30 -05:00
Alvaro Herrera	2f5c9d9c9c	Tweak catalog indexing abstraction for upcoming WARM Split the existing CatalogUpdateIndexes into two different routines, CatalogTupleInsert and CatalogTupleUpdate, which do both the heap insert/update plus the index update. This removes over 300 lines of boilerplate code all over src/backend/catalog/ and src/backend/commands. The resulting code is much more pleasing to the eye. Also, by encapsulating what happens in detail during an UPDATE, this facilitates the upcoming WARM patch, which is going to add a few more lines to the update case making the boilerplate even more boring. The original CatalogUpdateIndexes is removed; there was only one use left, and since it's just three lines, we can as well expand it in place there. We could keep it, but WARM is going to break all the UPDATE out-of-core callsites anyway, so there seems to be no benefit in doing so. Author: Pavan Deolasee Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com	2017-01-31 18:42:24 -03:00
Andres Freund	9ba8a9ce45	Use the new castNode() macro in a number of places. This is far from a pervasive conversion, but it's a good starting point. Author: Peter Eisentraut, with some minor changes by me Reviewed-By: Tom Lane Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com	2017-01-26 16:47:03 -08:00
Robert Haas	27cdb3414b	Reindent table partitioning code. We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.	2017-01-24 10:20:02 -05:00
Alvaro Herrera	9a34123bc3	Make messages mentioning type names more uniform This avoids additional translatable strings for each distinct type, as well as making our quoting style around type names more consistent (namely, that we don't quote type names). This continues what started as `f402b99501`. Discussion: https://postgr.es/m/20160401170642.GA57509@alvherre.pgsql	2017-01-18 16:08:20 -03:00
Tom Lane	ab1f0c8225	Change representation of statement lists, and add statement location info. This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre	2017-01-14 16:02:35 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Robert Haas	f0e44751d7	Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.	2016-12-07 13:17:55 -05:00
Kevin Grittner	8c48375e5f	Implement syntax for transition tables in AFTER triggers. This is infrastructure for the complete SQL standard feature. No support is included at this point for execution nodes or PLs. The intent is to add that soon. As this patch leaves things, standard syntax can create tuplestores to contain old and/or new versions of rows affected by a statement. References to these tuplestores are in the TriggerData structure. C triggers can access the tuplestores directly, so they are usable, but they cannot yet be referenced within a SQL statement.	2016-11-04 10:49:50 -05:00
Tom Lane	ea268cdc9a	Add macros to make AllocSetContextCreate() calls simpler and safer. I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>	2016-08-27 17:50:38 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Kevin Grittner	a343e223a5	Revert no-op changes to BufferGetPage() The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas	2016-04-20 08:31:19 -05:00
Kevin Grittner	8b65cf4c5e	Modify BufferGetPage() to prepare for "snapshot too old" feature This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.	2016-04-08 14:30:10 -05:00
Teodor Sigaev	8b99edefca	Revert CREATE INDEX ... INCLUDING ... It's not ready yet, revert two commits `690c543550` - unstable test output `386e3d7609` - patch itself	2016-04-08 21:52:13 +03:00
Teodor Sigaev	386e3d7609	CREATE INDEX ... INCLUDING (column[, ...]) Now indexes (but only B-tree for now) can contain "extra" column(s) which doesn't participate in index structure, they are just stored in leaf tuples. It allows to use index only scan by using single index instead of two or more indexes. Author: Anastasia Lubennikova with minor editorializing by me Reviewers: David Rowley, Peter Geoghegan, Jeff Janes	2016-04-08 19:45:59 +03:00
Alvaro Herrera	f402b99501	Type names should not be quoted Our actual convention, contrary to what I said in `59a2111b23`, is not to quote type names, as evidenced by unquoted use of format_type_be() result value in error messages. Remove quotes from recently tweaked messages accordingly. Per note from Tom Lane	2016-04-01 13:35:48 -03:00
Alvaro Herrera	59a2111b23	Improve internationalization of messages involving type names Change the slightly different variations of the message function FOO must return type BAR to a single wording, removing the variability in type name so that they all create a single translation entry; since the type name is not to be translated, there's no point in it being part of the message anyway. Also, change them all to use the same quoting convention, namely that the function name is not to be quoted but the type name is. (I'm not quite sure why this is so, but it's the clear majority.) Some similar messages such as "encoding conversion function FOO must ..." are also changed.	2016-03-28 14:24:37 -03:00
Simon Riggs	8320c625d9	Change comment to describe correct lock level used	2016-03-23 11:32:34 +00:00
Tom Lane	364a9f47ab	Refactor pull_var_clause's API to make it less tedious to extend. In commit `1d97c19a0f` and later `c1d9579dd8`, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway.	2016-03-10 15:53:07 -05:00
Tom Lane	72eee410d4	Move pg_constraint.h function declarations to new file pg_constraint_fn.h. A pending patch requires exporting a function returning Bitmapset from catalog/pg_constraint.c. As things stand, that would mean including nodes/bitmapset.h in pg_constraint.h, which might be hazardous for the client-side includability of that header. It's not entirely clear whether any client-side code needs to include pg_constraint.h, but it seems prudent to assume that there is some such code somewhere. Therefore, split off the function definitions into a new file pg_constraint_fn.h, similarly to what we've done for some other catalog header files.	2016-02-11 15:51:28 -05:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Kevin Grittner	5956b7f9e8	Fix typo in C comment. Merlin Moncure Backpatch to 9.5, where the misspelling was introduced	2015-08-23 10:38:57 -05:00
Andres Freund	e95126cf04	Don't use function definitions looking like old-style ones. This fixes a bunch of somewhat pedantic warnings with new compilers. Since by far the majority of other functions definitions use the (void) style it just seems to be consistent to do so as well in the remaining few places.	2015-08-15 17:25:00 +02:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Andres Freund	e8898e9169	Minor ON CONFLICT related comments and doc fixes. Geoff Winkless, Stephen Frost, Peter Geoghegan and me.	2015-05-08 19:24:14 +02:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Andres Freund	2c8f4836db	Represent columns requiring insert and update privileges indentently. Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund	2015-05-08 00:20:46 +02:00
Simon Riggs	0ef0396ae1	Reduce lock levels of some trigger DDL and add FKs Reduce lock levels to ShareRowExclusive for the following SQL CREATE TRIGGER (but not DROP or ALTER) ALTER TABLE ENABLE TRIGGER ALTER TABLE DISABLE TRIGGER ALTER TABLE … ADD CONSTRAINT FOREIGN KEY Original work by Simon Riggs, extracted and refreshed by Andreas Karlsson New test cases added by Andreas Karlsson Reviewed by Noah Misch, Andres Freund, Michael Paquier and Simon Riggs	2015-04-05 11:37:08 -04:00
Alvaro Herrera	a2e35b53c3	Change many routines to return ObjectAddress rather than OID The changed routines are mostly those that can be directly called by ProcessUtilitySlow; the intention is to make the affected object information more precise, in support for future event trigger changes. Originally it was envisioned that the OID of the affected object would be enough, and in most cases that is correct, but upon actually implementing the event trigger changes it turned out that ObjectAddress is more widely useful. Additionally, some command execution routines grew an output argument that's an object address which provides further info about the executed command. To wit: * for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of the new constraint * for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the schema that originally contained the object. * for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address of the object added to or dropped from the extension. There's no user-visible change in this commit, and no functional change either. Discussion: 20150218213255.GC6717@tamriel.snowman.net Reviewed-By: Stephen Frost, Andres Freund	2015-03-03 14:10:50 -03:00
Tom Lane	33a3b03d63	Use FLEXIBLE_ARRAY_MEMBER in some more places. Fix a batch of structs that are only visible within individual .c files. Michael Paquier	2015-02-20 17:32:01 -05:00
Stephen Frost	804b6b6db4	Fix column-privilege leak in error-message paths While building error messages to return to the user, BuildIndexValueDescription, ExecBuildSlotValueDescription and ri_ReportViolation would happily include the entire key or entire row in the result returned to the user, even if the user didn't have access to view all of the columns being included. Instead, include only those columns which the user is providing or which the user has select rights on. If the user does not have any rights to view the table or any of the columns involved then no detail is provided and a NULL value is returned from BuildIndexValueDescription and ExecBuildSlotValueDescription. Note that, for key cases, the user must have access to all of the columns for the key to be shown; a partial key will not be returned. Further, in master only, do not return any data for cases where row security is enabled on the relation and row security should be applied for the user. This required a bit of refactoring and moving of things around related to RLS- note the addition of utils/misc/rls.c. Back-patch all the way, as column-level privileges are now in all supported versions. This has been assigned CVE-2014-8161, but since the issue and the patch have already been publicized on pgsql-hackers, there's no point in trying to hide this commit.	2015-01-28 12:31:30 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Robert Haas	c8df9477f8	Fix potential NULL-pointer dereference. Commit `2781b4bea7` arranged to defer the setup of after-trigger-related data structures, but AfterTriggerPendingOnRel didn't get the memo.	2014-11-10 15:22:46 -05:00
Robert Haas	85bb81de53	Fix off-by-one error in `2781b4bea7`. Spotted by Tom Lane.	2014-10-24 08:18:28 -04:00
Robert Haas	2781b4bea7	Perform less setup work for AFTER triggers at transaction start. Testing reveals that the memory allocation we do at transaction start has small but measurable overhead on simple transactions. To cut down on that overhead, defer some of that work to the point when AFTER triggers are first used, thus avoiding it altogether if they never are. Patch by me. Review by Andres Freund.	2014-10-23 12:33:02 -04:00
Alvaro Herrera	df630b0dd5	Implement SKIP LOCKED for row-level locks This clause changes the behavior of SELECT locking clauses in the presence of locked rows: instead of causing a process to block waiting for the locks held by other processes (or raise an error, with NOWAIT), SKIP LOCKED makes the new reader skip over such rows. While this is not appropriate behavior for general purposes, there are some cases in which it is useful, such as queue-like tables. Catalog version bumped because this patch changes the representation of stored rules. Reviewed by Craig Ringer (based on a previous attempt at an implementation by Simon Riggs, who also provided input on the syntax used in the current patch), David Rowley, and Álvaro Herrera. Author: Thomas Munro	2014-10-07 17:23:34 -03:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Noah Misch	7cbe57c34d	Offer triggers on foreign tables. This covers all the SQL-standard trigger types supported for regular tables; it does not cover constraint triggers. The approach for acquiring the old row mirrors that for view INSTEAD OF triggers. For AFTER ROW triggers, we spool the foreign tuples to a tuplestore. This changes the FDW API contract; when deciding which columns to populate in the slot returned from data modification callbacks, writable FDWs will need to check for AFTER ROW triggers in addition to checking for a RETURNING clause. In support of the feature addition, refactor the TriggerFlags bits and the assembly of old tuples in ModifyTable. Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.	2014-03-23 02:16:34 -04:00
Noah Misch	6115480c54	Improve comments about AfterTriggerBeginQuery() query level usage.	2014-03-23 02:15:52 -04:00
Robert Haas	5f173040e3	Avoid repeated name lookups during table and index DDL. If the name lookups come to different conclusions due to concurrent activity, we might perform some parts of the DDL on a different table than other parts. At least in the case of CREATE INDEX, this can be used to cause the permissions checks to be performed against a different table than the index creation, allowing for a privilege escalation attack. This changes the calling convention for DefineIndex, CreateTrigger, transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible (in 9.2 and newer), and AlterTable (in 9.1 and older). In addition, CheckRelationOwnership is removed in 9.2 and newer and the calling convention is changed in older branches. A field has also been added to the Constraint node (FkConstraint in 8.4). Third-party code calling these functions or using the Constraint node will require updating. Report by Andres Freund. Patch by Robert Haas and Andres Freund, reviewed by Tom Lane. Security: CVE-2014-0062	2014-02-17 09:33:31 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Robert Haas	e55704d8b2	Add new wal_level, logical, sufficient for logical decoding. When wal_level=logical, we'll log columns from the old tuple as configured by the REPLICA IDENTITY facility added in commit `07cacba983`. This makes it possible a properly-configured logical replication solution to correctly follow table updates even if they change the chosen key columns, or, with REPLICA IDENTITY FULL, even if the table has no key at all. Note that updates which do not modify the replica identity column won't log anything extra, making the choice of a good key (i.e. one that will rarely be changed) important to performance when wal_level=logical is configured. Each insert, update, or delete to a catalog table will also log the CMIN and/or CMAX values of stamped by the current transaction. This is necessary because logical decoding will require access to historical snapshots of the catalog in order to decode some data types, and the CMIN/CMAX values that we may need in order to judge row visibility may have been overwritten by the time we need them. Andres Freund, reviewed in various versions by myself, Heikki Linnakangas, KONDO Mitsumasa, and many others.	2013-12-10 19:01:40 -05:00
Robert Haas	8e18d04d4d	Refine our definition of what constitutes a system relation. Although user-defined relations can't be directly created in pg_catalog, it's possible for them to end up there, because you can create them in some other schema and then use ALTER TABLE .. SET SCHEMA to move them there. Previously, such relations couldn't afterwards be manipulated, because IsSystemRelation()/IsSystemClass() rejected all attempts to modify objects in the pg_catalog schema, regardless of their origin. With this patch, they now reject only those objects in pg_catalog which were created at initdb-time, allowing most operations on user-created tables in pg_catalog to proceed normally. This patch also adds new functions IsCatalogRelation() and IsCatalogClass(), which is similar to IsSystemRelation() and IsSystemClass() but with a slightly narrower definition: only TOAST tables of system catalogs are included, rather than all TOAST tables. This is currently used only for making decisions about when invalidation messages need to be sent, but upcoming logical decoding patches will find other uses for this information. Andres Freund, with some modifications by me.	2013-11-28 20:57:20 -05:00
Robert Haas	568d4138c6	Use an MVCC snapshot, rather than SnapshotNow, for catalog scans. SnapshotNow scans have the undesirable property that, in the face of concurrent updates, the scan can fail to see either the old or the new versions of the row. In many cases, we work around this by requiring DDL operations to hold AccessExclusiveLock on the object being modified; in some cases, the existing locking is inadequate and random failures occur as a result. This commit doesn't change anything related to locking, but will hopefully pave the way to allowing lock strength reductions in the future. The major issue has held us back from making this change in the past is that taking an MVCC snapshot is significantly more expensive than using a static special snapshot such as SnapshotNow. However, testing of various worst-case scenarios reveals that this problem is not severe except under fairly extreme workloads. To mitigate those problems, we avoid retaking the MVCC snapshot for each new scan; instead, we take a new snapshot only when invalidation messages have been processed. The catcache machinery already requires that invalidation messages be sent before releasing the related heavyweight lock; else other backends might rely on locally-cached data rather than scanning the catalog at all. Thus, making snapshot reuse dependent on the same guarantees shouldn't break anything that wasn't already subtly broken. Patch by me. Review by Michael Paquier and Andres Freund.	2013-07-02 09:47:01 -04:00
Stephen Frost	551938ae22	Post-pgindent cleanup Make slightly better decisions about indentation than what pgindent is capable of. Mostly breaking out long function calls into one line per argument, with a few other minor adjustments. No functional changes- all whitespace. pgindent ran cleanly (didn't change anything) after. Passes all regressions.	2013-06-01 09:38:15 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Tom Lane	f8db76e875	Editorialize a bit on new ProcessUtility() API. Choose a saner ordering of parameters (adding a new input param after the output params seemed a bit random), update the function's header comment to match reality (cmon folks, is this really that hard?), get rid of useless and sloppily-defined distinction between PROCESS_UTILITY_SUBCOMMAND and PROCESS_UTILITY_GENERATED.	2013-04-28 00:18:45 -04:00
Robert Haas	05f3f9c7b2	Extend object-access hook machinery to support post-alter events. This also slightly widens the scope of what we support in terms of post-create events. KaiGai Kohei, with a few changes, mostly to the comments, by me	2013-03-17 22:57:26 -04:00
Robert Haas	f90cc26982	Code beautification for object-access hook machinery. KaiGai Kohei	2013-03-06 20:53:25 -05:00
Bruce Momjian	7e2322dff3	Allow CREATE TABLE IF EXIST so succeed if the schema is nonexistent Previously, CREATE TABLE IF EXIST threw an error if the schema was nonexistent. This was done by passing 'missing_ok' to the function that looks up the schema oid.	2013-01-26 13:24:50 -05:00
Alvaro Herrera	0ac5ad5134	Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov	2013-01-23 12:04:59 -03:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Robert Haas	c504513f83	Adjust many backend functions to return OID rather than void. Extracted from a larger patch by Dimitri Fontaine. It is hoped that this will provide infrastructure for enriching the new event trigger functionality, but it seems possibly useful for other purposes as well.	2012-12-23 18:37:58 -05:00
Tom Lane	da63fec7db	Add missing buffer lock acquisition in GetTupleForTrigger(). If we had not been holding buffer pin continuously since the tuple was initially fetched by the UPDATE or DELETE query, it would be possible for VACUUM or a page-prune operation to move the tuple while we're trying to copy it. This would result in a garbage "old" tuple value being passed to an AFTER ROW UPDATE or AFTER ROW DELETE trigger. The preconditions for this are somewhat improbable, and the timing constraints are very tight; so it's not so surprising that this hasn't been reported from the field, even though the bug has been there a long time. Problem found by Andres Freund. Back-patch to all active branches.	2012-11-30 13:55:55 -05:00
Kevin Grittner	6868ed7491	Throw error if expiring tuple is again updated or deleted. This prevents surprising behavior when a FOR EACH ROW trigger BEFORE UPDATE or BEFORE DELETE directly or indirectly updates or deletes the the old row. Prior to this patch the requested action on the row could be silently ignored while all triggered actions based on the occurence of the requested action could be committed. One example of how this could happen is if the BEFORE DELETE trigger for a "parent" row deleted "children" which had trigger functions to update summary or status data on the parent. This also prevents similar surprising problems if the query has a volatile function which updates a target row while it is already being updated. There are related issues present in FOR UPDATE cursors and READ COMMITTED queries which are not handled by this patch. These issues need further evalution to determine what change, if any, is needed. Where the new error messages are generated, in most cases the best fix will be to move code from the BEFORE trigger to an AFTER trigger. Where this is not feasible, the trigger can avoid the error by re-issuing the triggering statement and returning NULL. Documentation changes will be submitted in a separate patch. Kevin Grittner and Tom Lane with input from Florian Pflug and Robert Haas, based on problems encountered during conversion of Wisconsin Circuit Court trigger logic to plpgsql triggers.	2012-10-26 14:55:36 -05:00
Alvaro Herrera	c219d9b0a5	Split tuple struct defs from htup.h to htup_details.h This reduces unnecessary exposure of other headers through htup.h, which is very widely included by many files. I have chosen to move the function prototypes to the new file as well, because that means htup.h no longer needs to include tupdesc.h. In itself this doesn't have much effect in indirect inclusion of tupdesc.h throughout the tree, because it's also required by execnodes.h; but it's something to explore in the future, and it seemed best to do the htup.h change now while I'm busy with it.	2012-08-30 16:52:35 -04:00
Tom Lane	eaccfded98	Centralize the logic for detecting misplaced aggregates, window funcs, etc. Formerly we relied on checking after-the-fact to see if an expression contained aggregates, window functions, or sub-selects when it shouldn't. This is grotty, easily forgotten (indeed, we had forgotten to teach DefineIndex about rejecting window functions), and none too efficient since it requires extra traversals of the parse tree. To improve matters, define an enum type that classifies all SQL sub-expressions, store it in ParseState to show what kind of expression we are currently parsing, and make transformAggregateCall, transformWindowFuncCall, and transformSubLink check the expression type and throw error if the type indicates the construct is disallowed. This allows removal of a large number of ad-hoc checks scattered around the code base. The enum type is sufficiently fine-grained that we can still produce error messages of at least the same specificity as before. Bringing these error checks together revealed that we'd been none too consistent about phrasing of the error messages, so standardize the wording a bit. Also, rewrite checking of aggregate arguments so that it requires only one traversal of the arguments, rather than up to three as before. In passing, clean up some more comments left over from add_missing_from support, and annotate some tests that I think are dead code now that that's gone. (I didn't risk actually removing said dead code, though.)	2012-08-10 11:36:15 -04:00
Alvaro Herrera	f5bcd398ad	connoinherit may be true only for CHECK constraints The code was setting it true for other constraints, which is bogus. Doing so caused bogus catalog entries for such constraints, and in particular caused an error to be raised when trying to drop a constraint of types other than CHECK from a table that has children, such as reported in bug #6712. In 9.2, additionally ignore connoinherit=true for other constraint types, to avoid having to force initdb; existing databases might already contain bogus catalog entries. Includes a catversion bump (in HEAD only). Bug report from Miroslav Šulc Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch.	2012-07-20 14:08:07 -04:00
Robert Haas	3a0e4d36eb	Make new event trigger facility actually do something. Commit `3855968f32` added syntax, pg_dump, psql support, and documentation, but the triggers didn't actually fire. With this commit, they now do. This is still a pretty basic facility overall because event triggers do not get a whole lot of information about what the user is trying to do unless you write them in C; and there's still no option to fire them anywhere except at the very beginning of the execution sequence, but it's better than nothing, and a good building block for future work. Along the way, add a regression test for ALTER LARGE OBJECT, since testing of event triggers reveals that we haven't got one. Dimitri Fontaine and Robert Haas	2012-07-20 11:39:01 -04:00
Peter Eisentraut	b8b2e3b2de	Replace int2/int4 in C code with int16/int32 The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.	2012-06-25 01:51:46 +03:00
Tom Lane	cfa0f4255b	Improve tests for whether we can skip queueing RI enforcement triggers. During an update of a PK row, we can skip firing the RI trigger if any old key value is NULL, because then the row could not have had any matching rows in the FK table. Conversely, during an update of an FK row, the outcome is determined if any new key value is NULL. In either case it becomes unnecessary to compare individual key values. This patch was inspired by discussion of Vik Reykja's patch to use IS NOT DISTINCT semantics for the key comparisons. In the event there is no need for that and so this patch looks nothing like his, but he should still get credit for having re-opened consideration of the trigger skip logic.	2012-06-19 20:07:33 -04:00
Tom Lane	f5297bdfe4	Refer to the default foreign key match style as MATCH SIMPLE internally. Previously we followed the SQL92 wording, "MATCH <unspecified>", but since SQL99 there's been a less awkward way to refer to the default style. In addition to the code changes, pg_constraint.confmatchtype now stores this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED). This doesn't affect pg_dump or psql because they use pg_get_constraintdef() to reconstruct foreign key definitions. But other client-side code might examine that column directly, so this change will have to be marked as an incompatibility in the 9.3 release notes.	2012-06-17 20:16:44 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Alvaro Herrera	09ff76fcdb	Recast "ONLY" column CHECK constraints as NO INHERIT The original syntax wasn't universally loved, and it didn't allow its usage in CREATE TABLE, only ALTER TABLE. It now works everywhere, and it also allows using ALTER TABLE ONLY to add an uninherited CHECK constraint, per discussion. The pg_constraint column has accordingly been renamed connoinherit. This commit partly reverts some of the changes in `61d81bd28d`, particularly some pg_dump and psql bits, because now pg_get_constraintdef includes the necessary NO INHERIT within the constraint definition. Author: Nikhil Sontakke Some tweaks by me	2012-04-20 23:56:57 -03:00
Robert Haas	07d1edb954	Extend object access hook framework to support arguments, and DROP. This allows loadable modules to get control at drop time, perhaps for the purpose of performing additional security checks or to log the event. The initial purpose of this code is to support sepgsql, but other applications should be possible as well. KaiGai Kohei, reviewed by me.	2012-03-09 14:34:56 -05:00
Tom Lane	891e6e7bfd	Require execute permission on the trigger function for CREATE TRIGGER. This check was overlooked when we added function execute permissions to the system years ago. For an ordinary trigger function it's not a big deal, since trigger functions execute with the permissions of the table owner, so they couldn't do anything the user issuing the CREATE TRIGGER couldn't have done anyway. However, if a trigger function is SECURITY DEFINER, that is not the case. The lack of checking would allow another user to install it on his own table and then invoke it with, essentially, forged input data; which the trigger function is unlikely to realize, so it might do something undesirable, for instance insert false entries in an audit log table. Reported by Dinesh Kumar, patch by Robert Haas Security: CVE-2012-0866	2012-02-23 15:38:56 -05:00
Alvaro Herrera	74ab96a45e	Add pg_trigger_depth() function This reports the depth level of triggers currently in execution, or zero if not called from inside a trigger. No catversion bump in this patch, but you have to initdb if you want access to the new function. Author: Kevin Grittner	2012-01-25 13:22:54 -03:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Alvaro Herrera	61d81bd28d	Allow CHECK constraints to be declared ONLY This makes them enforceable only on the parent table, not on children tables. This is useful in various situations, per discussion involving people bitten by the restrictive behavior introduced in 8.4. Message-Id: 8762mp93iw.fsf@comcast.net CAFaPBrSMMpubkGf4zcRL_YL-AERUbYF_-ZNNYfb3CVwwEqc9TQ@mail.gmail.com Authors: Nikhil Sontakke, Alex Hunsaker Reviewed by Robert Haas and myself	2011-12-19 17:30:23 -03:00
Robert Haas	74a1d4fe7c	Improve behavior of concurrent rename statements. Previously, renaming a table, sequence, view, index, foreign table, column, or trigger checked permissions before locking the object, which meant that if permissions were revoked during the lock wait, we would still allow the operation. Similarly, if the original object is dropped and a new one with the same name is created, the operation will be allowed if we had permissions on the old object; the permissions on the new object don't matter. All this is now fixed. Along the way, attempting to rename a trigger on a foreign table now gives the same error message as trying to create one there in the first place (i.e. that it's not a table or view) rather than simply stating that no trigger by that name exists. Patch by me; review by Noah Misch.	2011-12-15 19:02:38 -05:00
Robert Haas	2ad36c4e44	Improve table locking behavior in the face of current DDL. In the previous coding, callers were faced with an awkward choice: look up the name, do permissions checks, and then lock the table; or look up the name, lock the table, and then do permissions checks. The first choice was wrong because the results of the name lookup and permissions checks might be out-of-date by the time the table lock was acquired, while the second allowed a user with no privileges to interfere with access to a table by users who do have privileges (e.g. if a malicious backend queues up for an AccessExclusiveLock on a table on which AccessShareLock is already held, further attempts to access the table will be blocked until the AccessExclusiveLock is obtained and the malicious backend's transaction rolls back). To fix, allow callers of RangeVarGetRelid() to pass a callback which gets executed after performing the name lookup but before acquiring the relation lock. If the name lookup is retried (because invalidation messages are received), the callback will be re-executed as well, so we get the best of both worlds. RangeVarGetRelid() is renamed to RangeVarGetRelidExtended(); callers not wishing to supply a callback can continue to invoke it as RangeVarGetRelid(), which is now a macro. Since the only one caller that uses nowait = true now passes a callback anyway, the RangeVarGetRelid() macro defaults nowait as well. The callback can also be used for supplemental locking - for example, REINDEX INDEX needs to acquire the table lock before the index lock to reduce deadlock possibilities. There's a lot more work to be done here to fix all the cases where this can be a problem, but this commit provides the general infrastructure and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE, LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE. Per discussion with Noah Misch and Alvaro Herrera.	2011-11-30 10:27:00 -05:00
Robert Haas	fc6d1006bd	Further consolidation of DROP statement handling. This gets rid of an impressive amount of duplicative code, with only minimal behavior changes. DROP FOREIGN DATA WRAPPER now requires object ownership rather than superuser privileges, matching the documentation we already have. We also eliminate the historical warning about dropping a built-in function as unuseful. All operations are now performed in the same order for all object types handled by dropcmds.c. KaiGai Kohei, with minor revisions by me	2011-11-17 21:32:34 -05:00
Tom Lane	5ac5980744	More cleanup after failed reduced-lock-levels-for-DDL feature. Turns out that use of ShareUpdateExclusiveLock or ShareRowExclusiveLock to protect DDL changes had gotten copied into several places that were not touched by either of Simon's original patches for the feature, and thus neither he nor I thought to revert them. (Indeed, it appears that two of these uses were committed after the reversion, which just goes to show that git merging is no panacea.) Change these places to use AccessExclusiveLock again. If we ever manage to resurrect that feature, we're going to have to think a bit harder about how to keep lock level usage in sync for DDL operations that aren't within the AlterTable infrastructure. Two of these bugs are only in HEAD, but one is in the 9.1 branch too. Alvaro found one of them, I found the other two.	2011-10-21 13:50:30 -04:00
Tom Lane	a0185461dd	Rearrange the implementation of index-only scans. This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).	2011-10-11 14:21:30 -04:00
Tom Lane	f197272365	Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps. This provides information about the numbers of tuples that were visited but not returned by table scans, as well as the numbers of join tuples that were considered and discarded within a join plan node. There is still some discussion going on about the best way to report counts for outer-join situations, but I think most of what's in the patch would not change if we revise that, so I'm going to go ahead and commit it as-is. Documentation changes to follow (they weren't in the submitted patch either). Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom	2011-09-22 11:30:11 -04:00
Tom Lane	b33f78df17	Fix trigger WHEN conditions when both BEFORE and AFTER triggers exist. Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger fired for the same update. Fix by not trying to overload the use of estate->es_trig_tuple_slot. Per report from Yoran Heling. Back-patch to 9.0, when trigger WHEN conditions were introduced.	2011-08-21 18:15:55 -04:00
Tom Lane	1af37ec96d	Replace errdetail("%s", ...) with errdetail_internal("%s", ...). There may be some other places where we should use errdetail_internal, but they'll have to be evaluated case-by-case. This commit just hits a bunch of places where invoking gettext is obviously a waste of cycles.	2011-07-16 14:22:18 -04:00
Tom Lane	c1d9579dd8	Avoid listing ungrouped Vars in the targetlist of Agg-underneath-Window. Regular aggregate functions in combination with, or within the arguments of, window functions are OK per spec; they have the semantics that the aggregate output rows are computed and then we run the window functions over that row set. (Thus, this combination is not really useful unless there's a GROUP BY so that more than one aggregate output row is possible.) The case without GROUP BY could fail, as recently reported by Jeff Davis, because sloppy construction of the Agg node's targetlist resulted in extra references to possibly-ungrouped Vars appearing outside the aggregate function calls themselves. See the added regression test case for an example. Fixing this requires modifying the API of flatten_tlist and its underlying function pull_var_clause. I chose to make pull_var_clause's API for aggregates identical to what it was already doing for placeholders, since the useful behaviors turn out to be the same (error, report node as-is, or recurse into it). I also tightened the error checking in this area a bit: if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here, that was a long time ago, so complain instead of ignoring them. Backpatch into 9.1. The failure exists in 8.4 and 9.0 as well, but seeing that it only occurs in a basically-useless corner case, it doesn't seem worth the risks of changing a function API in a minor release. There might be third-party code using pull_var_clause.	2011-07-12 18:24:39 -04:00
Robert Haas	4240e429d0	Try to acquire relation locks in RangeVarGetRelid. In the previous coding, we would look up a relation in RangeVarGetRelid, lock the resulting OID, and then AcceptInvalidationMessages(). While this was sufficient to ensure that we noticed any changes to the relation definition before building the relcache entry, it didn't handle the possibility that the name we looked up no longer referenced the same OID. This was particularly problematic in the case where a table had been dropped and recreated: we'd latch on to the entry for the old relation and fail later on. Now, we acquire the relation lock inside RangeVarGetRelid, and retry the name lookup if we notice that invalidation messages have been processed meanwhile. Many operations that would previously have failed with an error in the presence of concurrent DDL will now succeed. There is a good deal of work remaining to be done here: many callers of RangeVarGetRelid still pass NoLock for one reason or another. In addition, nothing in this patch guards against the possibility that the meaning of an unqualified name might change due to the creation of a relation in a schema earlier in the user's search path than the one where it was previously found. Furthermore, there's nothing at all here to guard against similar race conditions for non-relations. For all that, it's a start. Noah Misch and Robert Haas	2011-07-08 22:19:30 -04:00
Tom Lane	a195e3c34f	Finish disabling reduced-lock-levels-for-DDL feature. Previous patch only covered the ALTER TABLE changes, not changes in other commands; and it neglected to revert the documentation changes.	2011-07-07 13:15:15 -04:00
Alvaro Herrera	b93f5a5673	Move Trigger and TriggerDesc structs out of rel.h into a new reltrigger.h This lets us stop including rel.h into execnodes.h, which is a widely used header.	2011-07-04 14:35:58 -04:00
Tom Lane	e1ccaff6ee	Rework parsing of ConstraintAttributeSpec to improve NOT VALID handling. The initial commit of the ALTER TABLE ADD FOREIGN KEY NOT VALID feature failed to support labeling such constraints as deferrable. The best fix for this seems to be to fold NOT VALID into ConstraintAttributeSpec. That's a bit more general than the documented syntax, but it allows better-targeted syntax error messages. In addition, do some mostly-but-not-entirely-cosmetic code review for the whole NOT VALID patch.	2011-06-15 19:06:21 -04:00
Tom Lane	d64713df7e	Pass collations to functions in FunctionCallInfoData, not FmgrInfo. Since collation is effectively an argument, not a property of the function, FmgrInfo is really the wrong place for it; and this becomes critical in cases where a cached FmgrInfo is used for varying purposes that might need different collation settings. Fix by passing it in FunctionCallInfoData instead. In particular this allows a clean fix for bug #5970 (record_cmp not working). This requires touching a bit more code than the original method, but nobody ever thought that collations would not be an invasive patch...	2011-04-12 19:19:24 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	d8d429890d	Fix collations when we call transformWhereClause from outside the parser. Previous patches took care of assorted places that call transformExpr from outside the main parser, but I overlooked the fact that some places use transformWhereClause as a shortcut for transformExpr + coerce_to_boolean. In particular this broke collation-sensitive index WHERE clauses, as per report from Thom Brown. Trigger WHEN and rule WHERE clauses too. I'm not forcing initdb for this fix, but any affected indexes, triggers, or rules will need to be dropped and recreated.	2011-04-07 02:34:57 -04:00
Tom Lane	a874fe7b4c	Refactor the executor's API to support data-modifying CTEs better. The originally committed patch for modifying CTEs didn't interact well with EXPLAIN, as noted by myself, and also had corner-case problems with triggers, as noted by Dean Rasheed. Those problems show it is really not practical for ExecutorEnd to call any user-defined code; so split the cleanup duties out into a new function ExecutorFinish, which must be called between the last ExecutorRun call and ExecutorEnd. Some Asserts have been added to these functions to help verify correct usage. It is no longer necessary for callers of the executor to call AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now done by ExecutorStart/ExecutorFinish respectively. If you really need to suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to ExecutorStart. Also, refactor portal commit processing to allow for the possibility that PortalDrop will invoke user-defined code. I think this is not actually necessary just yet, since the portal-execution-strategy logic forces any non-pure-SELECT query to be run to completion before we will consider committing. But it seems like good future-proofing.	2011-02-27 13:44:12 -05:00
Tom Lane	a210be7720	Fix dangling-pointer problem in before-row update trigger processing. ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new tuple value by seeing if the returned tuple was pointer-equal to the old one. But the "old one" was in estate->es_junkFilter's result slot, which would be scribbled on if we had done an EvalPlanQual update in response to a concurrent update of the target tuple; therefore we were comparing a dangling pointer to a live one. Given the right set of circumstances we could get a false match, resulting in not forcing the tuple to be stored in the slot we thought it was stored in. In the case reported by Maxim Boguk in bug #5798, this led to "cannot extract system attribute from virtual tuple" failures when trying to do "RETURNING ctid". I believe there is a very-low-probability chance of more serious errors, such as generating incorrect index entries based on the original rather than the trigger-modified version of the row. In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers, ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to have similar APIs. In the back branches I just changed ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers case.	2011-02-21 21:19:50 -05:00
Simon Riggs	722bf7017b	Extend ALTER TABLE to allow Foreign Keys to be added without initial validation. FK constraints that are marked NOT VALID may later be VALIDATED, which uses an ShareUpdateExclusiveLock on constraint table and RowShareLock on referenced table. Significantly reduces lock strength and duration when adding FKs. New state visible from psql. Simon Riggs, with reviews from Marko Tiikkaja and Robert Haas	2011-02-08 12:23:20 +00:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Robert Haas	cc1ed40d57	Object access hook framework, with post-creation hook. After a SQL object is created, we provide an opportunity for security or logging plugins to get control; for example, a security label provider could use this to assign an initial security label to newly created objects. The basic infrastructure is (hopefully) reusable for other types of events that might require similar treatment. KaiGai Kohei, with minor adjustments.	2010-11-25 11:50:13 -05:00
Tom Lane	2ec993a7cb	Support triggers on views. This patch adds the SQL-standard concept of an INSTEAD OF trigger, which is fired instead of performing a physical insert/update/delete. The trigger function is passed the entire old and/or new rows of the view, and must figure out what to do to the underlying tables to implement the update. So this feature can be used to implement updatable views using trigger programming style rather than rule hacking. In passing, this patch corrects the names of some columns in the information_schema.triggers view. It seems the SQL committee renamed them somewhere between SQL:99 and SQL:2003. Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.	2010-10-10 13:45:07 -04:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Joe Conway	5eb15c9942	SERIALIZABLE transactions are actually implemented beneath the covers with transaction snapshots, i.e. a snapshot registered at the beginning of a transaction. Change variable naming and comments to reflect this reality in preparation for a future, truly serializable mode, e.g. Serializable Snapshot Isolation (SSI). For the moment transaction snapshots are still used to implement SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin Grittner and Dan Ports with review and some minor wording changes by me.	2010-09-11 18:38:58 +00:00
Tom Lane	e275d16a54	Fix possible corruption of AfterTriggerEventLists in subtransaction rollback. afterTriggerInvokeEvents failed to adjust events->tailfree when truncating the last chunk of an event list. This could result in the data being "de-truncated" by afterTriggerRestoreEventList during a subsequent subtransaction abort. Even that wouldn't kill us, because the re-added data would just be events marked DONE --- unless the data had been partially overwritten by new events. Then we might crash, or in any case misbehave (perhaps fire triggers twice, or fire triggers with the wrong event data). Per bug #5622 from Thue Janus Kristensen. Back-patch to 8.4 where the current trigger list representation was introduced.	2010-08-19 15:46:18 +00:00
Robert Haas	fd1843ff89	Standardize get_whatever_oid functions for other object types. - Rename TSParserGetPrsid to get_ts_parser_oid. - Rename TSDictionaryGetDictid to get_ts_dict_oid. - Rename TSTemplateGetTmplid to get_ts_template_oid. - Rename TSConfigGetCfgid to get_ts_config_oid. - Rename FindConversionByName to get_conversion_oid. - Rename GetConstraintName to get_constraint_oid. - Add new functions get_opclass_oid, get_opfamily_oid, get_rewrite_oid, get_rewrite_oid_without_relid, get_trigger_oid, and get_cast_oid. The name of each function matches the corresponding catalog. Thanks to KaiGai Kohei for the review.	2010-08-05 15:25:36 +00:00
Simon Riggs	2dbbda02e7	Reduce lock levels of CREATE TRIGGER and some ALTER TABLE, CREATE RULE actions. Avoid hard-coding lockmode used for many altering DDL commands, allowing easier future changes of lock levels. Implementation of initial analysis on DDL sub-commands, so that many lock levels are now at ShareUpdateExclusiveLock or ShareRowExclusiveLock, allowing certain DDL not to block reads/writes. First of number of planned changes in this area; additional docs required when full project complete.	2010-07-28 05:22:24 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Robert Haas	e26c539e9f	Wrap calls to SearchSysCache and related functions using macros. The purpose of this change is to eliminate the need for every caller of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists, GetSysCacheOid, and SearchSysCacheList to know the maximum number of allowable keys for a syscache entry (currently 4). This will make it far easier to increase the maximum number of keys in a future release should we choose to do so, and it makes the code shorter, too. Design and review by Tom Lane.	2010-02-14 18:42:19 +00:00
Tom Lane	875353b99f	Fix assorted core dumps and Assert failures that could occur during AbortTransaction or AbortSubTransaction, when trying to clean up after an error that prevented (sub)transaction start from completing: * access to TopTransactionResourceOwner that might not exist * assert failure in AtEOXact_GUC, if AtStart_GUC not called yet * assert failure or core dump in AfterTriggerEndSubXact, if AfterTriggerBeginSubXact not called yet Per testing by injecting elog(ERROR) at successive steps in StartTransaction and StartSubTransaction. It's not clear whether all of these cases could really occur in the field, but at least one of them is easily exposed by simple stress testing, as per my accidental discovery yesterday.	2010-01-24 21:49:17 +00:00
Tom Lane	9a915e596f	Improve the handling of SET CONSTRAINTS commands by having them search pg_constraint before searching pg_trigger. This allows saner handling of corner cases; in particular we now say "constraint is not deferrable" rather than "constraint does not exist" when the command is applied to a constraint that's inherently non-deferrable. Per a gripe several months ago from hubert depesz lubaczewski. To make this work without breaking user-defined constraint triggers, we have to add entries for them to pg_constraint. However, in return we can remove the pgconstrname column from pg_constraint, which represents a fairly sizable space savings. I also replaced the tgisconstraint column with tgisinternal; the old meaning of tgisconstraint can now be had by testing for nonzero tgconstraint, while there is no other way to get the old meaning of nonzero tgconstraint, namely that the trigger was internally generated rather than being user-created. In passing, fix an old misstatement in the docs and comments, namely that pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable. Actually, we mark RI action triggers as nondeferrable even when they belong to a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on that instead of hard-coding a list of exception OIDs.	2010-01-17 22:56:23 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	7fc0f06221	Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be checked to determine whether the trigger should be fired. For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER triggers it can provide a noticeable performance improvement, since queuing of a deferred trigger event and re-fetching of the row(s) at end of statement can be short-circuited if the trigger does not need to be fired. Takahiro Itagaki, reviewed by KaiGai Kohei.	2009-11-20 20:38:12 +00:00
Tom Lane	44956c52c5	Fix AfterTriggerSaveEvent to use a test and elog, not just Assert, to check that it's called within an AfterTriggerBeginQuery/AfterTriggerEndQuery pair. The RI cascade triggers suppress that overhead on the assumption that they are always run non-deferred, so it's possible to violate the condition if someone mistakenly changes pg_trigger to mark such a trigger deferred. We don't really care about supporting that, but throwing an error instead of crashing seems desirable. Per report from Marcelo Costa.	2009-10-27 20:14:27 +00:00
Tom Lane	9f2ee8f287	Re-implement EvalPlanQual processing to improve its performance and eliminate a lot of strange behaviors that occurred in join cases. We now identify the "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the appropriate row into each scan node in the rechecking plan, forcing it to emit only that one row. The former behavior could rescan the whole of each joined relation for each recheck, which was terrible for performance, and what's much worse could result in duplicated output tuples. Also, the original implementation of EvalPlanQual could not re-use the recheck execution tree --- it had to go through a full executor init and shutdown for every row to be tested. To avoid this overhead, I've associated a special runtime Param with each LockRows or ModifyTable plan node, and arranged to make every scan node below such a node depend on that Param. Thus, by signaling a change in that Param, the EPQ machinery can just rescan the already-built test plan. This patch also adds a prohibition on set-returning functions in the targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the duplicate-output-tuple problem. It seems fairly reasonable since the other restrictions on SELECT FOR UPDATE are meant to ensure that there is a unique correspondence between source tuples and result tuples, which an output SRF destroys as much as anything else does.	2009-10-26 02:26:45 +00:00
Tom Lane	b2734a0d79	Support SQL-compliant triggers on columns, ie fire only if certain columns are named in the UPDATE's SET list. Note: the schema of pg_trigger has not actually changed; we've just started to use a column that was there all along. catversion bumped anyway so that this commit is included in the history of potentially interesting changes to system catalog contents. Itagaki Takahiro	2009-10-14 22:14:25 +00:00
Tom Lane	8a5849b7ff	Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. They are now handled by a new plan node type called ModifyTable, which is placed at the top of the plan tree. In itself this change doesn't do much, except perhaps make the handling of RETURNING lists and inherited UPDATEs a tad less klugy. But it is necessary preparation for the intended extension of allowing RETURNING queries inside WITH. Marko Tiikkaja	2009-10-10 01:43:50 +00:00
Tom Lane	a2a8c7a662	Support hex-string input and output for type BYTEA. Both hex format and the traditional "escape" format are automatically handled on input. The output format is selected by the new GUC variable bytea_output. As committed, bytea_output defaults to HEX, which is an incompatible change. We will keep it this way for awhile for testing purposes, but should consider whether to switch to the more backwards-compatible default of ESCAPE before 8.5 is released. Peter Eisentraut	2009-08-04 16:08:37 +00:00
Tom Lane	060baf2784	Merge the Constraint and FkConstraint node types into a single type. This was foreseen to be a good idea long ago, but nobody had got round to doing it. The recent patch for deferred unique constraints made transformConstraintAttrs() ugly enough that I decided it was time. This change will also greatly simplify parsing of deferred CHECK constraints, if anyone ever gets around to implementing that. While at it, add a location field to Constraint, and use that to provide an error cursor for some of the constraint-related error messages.	2009-07-30 02:45:38 +00:00
Tom Lane	25d9bf2e3e	Support deferrable uniqueness constraints. The current implementation fires an AFTER ROW trigger for each tuple that looks like it might be non-unique according to the index contents at the time of insertion. This works well as long as there aren't many conflicts, but won't scale to massive unique-key reassignments. Improving that case is a TODO item. Dean Rasheed	2009-07-29 20:56:21 +00:00
Tom Lane	c1b9ec24ef	Add system catalog columns pg_constraint.conindid and pg_trigger.tgconstrindid. conindid is the index supporting a constraint. We can use this not only for unique/primary-key constraints, but also foreign-key constraints, which depend on the unique index that constrains the referenced columns. tgconstrindid is just copied from the constraint's conindid field, or is zero for triggers not associated with constraints. This is mainly intended as infrastructure for upcoming patches, but it has some virtue in itself, since it exposes a relationship that you formerly had to grovel in pg_depend to determine. I simplified one information_schema view accordingly. (There is a pg_dump query that could also use conindid, but I left it alone because it wasn't clear it'd get any faster.)	2009-07-28 02:56:31 +00:00
Tom Lane	f08e5e92e8	Fix the just-reported problem that you can't specify all four trigger event types in CREATE TRIGGER. While at it, clean up the amazingly tedious and inextensible way that the trigger event type list was handled. Per report from Greg Sabino Mullane.	2009-06-18 01:27:02 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Tom Lane	3cb5d6580a	Support column-level privileges, as required by SQL standard. Stephen Frost, with help from KaiGai Kohei and others	2009-01-22 20:16:10 +00:00
Heikki Linnakangas	c079090bbc	Update comments to reflect that tgenabled is not a boolean anymore. Jonah Harris, with minor tinkering by me.	2009-01-22 19:16:31 +00:00
Magnus Hagander	fa40ca42a6	Make some strings translatable again that were accidentally removed in earlier patch to fix "printf-arguments".	2009-01-21 09:28:26 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Tom Lane	c98a923786	Fix failure to ensure that a snapshot is available to datatype input functions when they are invoked by the parser. We had been setting up a snapshot at plan time but really it needs to be done earlier, before parse analysis. Per report from Dmitry Koterov. Also fix two related problems discovered while poking at this one: exec_bind_message called datatype input functions without establishing a snapshot, and SET CONSTRAINTS IMMEDIATE could call trigger functions without establishing a snapshot. Backpatch to 8.2. The underlying problem goes much further back, but it is masked in 8.1 and before because we didn't attempt to invoke domain check constraints within datatype input. It would only be exposed if a C-language datatype input function used the snapshot; which evidently none do, or we'd have heard complaints sooner. Since this code has changed a lot over time, a back-patch is hardly risk-free, and so I'm disinclined to patch further than absolutely necessary.	2008-12-13 02:00:20 +00:00

1 2 3 4 5 ...

491 Commits