postgresql

Commit Graph

Author	SHA1	Message	Date
Andres Freund	15d8f83128	Verify that expected slot types match returned slot types. This is important so JIT compilation knows what kind of tuple slot the deforming routine can expect. There's also optimization potential for expression initialization without JIT compilation. It e.g. seems plausible to elide EEOP_*_FETCHSOME ops entirely when dealing with virtual slots. Author: Andres Freund Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-15 22:00:30 -08:00
Andres Freund	675af5c01e	Compute information about EEOP_*_FETCHSOME at expression init time. Previously this information was computed when JIT compiling an expression. But the information is useful for assertions in the non-JIT case too (for assertions), therefore it makes sense to move it. This will, in a followup commit, allow to treat different slot types differently. E.g. for virtual slots there's no need to generate a JIT function to deform the slot. Author: Andres Freund Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-15 22:00:30 -08:00
Andres Freund	1a0586de36	Introduce notion of different types of slots (without implementing them). Upcoming work intends to allow pluggable ways to introduce new ways of storing table data. Accessing those table access methods from the executor requires TupleTableSlots to be carry tuples in the native format of such storage methods; otherwise there'll be a significant conversion overhead. Different access methods will require different data to store tuples efficiently (just like virtual, minimal, heap already require fields in TupleTableSlot). To allow that without requiring additional pointer indirections, we want to have different structs (embedding TupleTableSlot) for different types of slots. Thus different types of slots are needed, which requires adapting creators of slots. The slot that most efficiently can represent a type of tuple in an executor node will often depend on the type of slot a child node uses. Therefore we need to track the type of slot is returned by nodes, so parent slots can create slots based on that. Relatedly, JIT compilation of tuple deforming needs to know which type of slot a certain expression refers to, so it can create an appropriate deforming function for the type of tuple in the slot. But not all nodes will only return one type of slot, e.g. an append node will potentially return different types of slots for each of its subplans. Therefore add function that allows to query the type of a node's result slot, and whether it'll always be the same type (whether it's fixed). This can be queried using ExecGetResultSlotOps(). The scan, result, inner, outer type of slots are automatically inferred from ExecInitScanTupleSlot(), ExecInitResultSlot(), left/right subtrees respectively. If that's not correct for a node, that can be overwritten using new fields in PlanState. This commit does not introduce the actually abstracted implementation of different kind of TupleTableSlots, that will be left for a followup commit. The different types of slots introduced will, for now, still use the same backing implementation. While this already partially invalidates the big comment in tuptable.h, it seems to make more sense to update it later, when the different TupleTableSlot implementations actually exist. Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-15 22:00:30 -08:00
Andres Freund	763f2edd92	Rejigger materializing and fetching a HeapTuple from a slot. Previously materializing a slot always returned a HeapTuple. As current work aims to reduce the reliance on HeapTuples (so other storage systems can work efficiently), that needs to change. Thus split the tasks of materializing a slot (i.e. making it independent from the underlying storage / other memory contexts) from fetching a HeapTuple from the slot. For brevity, allow to fetch a HeapTuple from a slot and materializing the slot at the same time, controlled by a parameter. For now some callers of ExecFetchSlotHeapTuple, with materialize = true, expect that changes to the heap tuple will be reflected in the underlying slot. Those places will be adapted in due course, so while not pretty, that's OK for now. Also rename ExecFetchSlotTuple to ExecFetchSlotHeapTupleDatum and ExecFetchSlotTupleDatum to ExecFetchSlotHeapTupleDatum, as it's likely that future storage methods will need similar methods. There already is ExecFetchSlotMinimalTuple, so the new names make the naming scheme more coherent. Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-15 14:31:12 -08:00
Peter Eisentraut	86a4819f69	Update executor documentation for run-time partition pruning With run-time partition pruning, there is no longer necessarily an executor node for each corresponding plan node. Author: David Rowley <david.rowley@2ndquadrant.com>	2018-11-15 23:02:21 +01:00
Andres Freund	c058fc2a2b	Rationalize expression context reset in ExecModifyTable(). The current pattern of reseting expressions both in ExecProcessReturning() and ExecOnConflictUpdate() makes it harder than necessary to reason about memory lifetimes. It also requires materializing slots unnecessarily, although this patch doesn't take advantage of the fact that that's not necessary anymore. Instead reset the expression context once for each input tuple. Author: Ashutosh Bapat Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-15 11:40:50 -08:00
Tom Lane	34c9e455d0	Improve performance of partition pruning remapping a little. ExecFindInitialMatchingSubPlans has to update the PartitionPruneState's subplan mapping data to account for the removal of subplans it prunes. However, that's only necessary if run-time pruning will also occur, so we can skip it when that won't happen, which should result in not needing to do the remapping in many cases. (We now need it only when some partitions are potentially startup-time prunable and others are potentially run-time prunable, which seems like an unusual case.) Also make some marginal performance improvements in the remapping itself. These will mainly win if most partitions got pruned by the startup-time pruning, which is perhaps a debatable assumption in this context. Also fix some bogus comments, and rearrange code to marginally reduce space consumption in the executor's query-lifespan context. David Rowley, reviewed by Yoshikazu Imai Discussion: https://postgr.es/m/CAKJS1f9+m6-di-zyy4B4AGn0y1B9F8UKDRigtBbNviXOkuyOpw@mail.gmail.com	2018-11-15 13:34:16 -05:00
Alvaro Herrera	74514bd4a5	geo_ops.c: Clarify comments and function arguments These functions were not crystal clear about what their respective APIs are. Make an effort to improve that. Emre's patch was correct AFAICT, but I (Álvaro) felt the need to improve a few comments a bit more. Any resulting errors are my own. Per complaint from Coverity, Ning Yu, and Tom Lane. Author: Emre Hasegeli, Álvaro Herrera Reviewed-by: Tomas Vondra, Álvaro Herrera Discussion: https://postgr.es/m/26769.1533090136@sss.pgh.pa.us	2018-11-15 12:08:03 -03:00
Thomas Munro	ab8984f52d	Further adjustment to random() seed initialization. Per complaint from Tom Lane, don't chomp the timestamp at 32 bits, so we can shift in some of its higher bits. Discussion: https://postgr.es/m/14712.1542253115%40sss.pgh.pa.us	2018-11-15 17:38:55 +13:00
Thomas Munro	5b0ce3ec33	Increase the number of possible random seeds per time period. Commit `197e4af9` refactored the initialization of the libc random() seed, but reduced the number of possible seeds values that could be chosen in a given time period. This negation of the effects of commit `98c50656c` was unintentional. Replace with code that shifts the fast moving timestamp bits left, similar to the original algorithm (though not the previous float-tolerating coding, which is no longer necessary). Author: Thomas Munro Reported-by: Noah Misch Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20181112083358.GA1073796%40rfd.leadboat.com	2018-11-15 16:25:30 +13:00
Thomas Munro	aa55183042	Use 64 bit type for BufFileSize(). BufFileSize() can't use off_t, because it's only 32 bits wide on some systems. BufFile objects can have many 1GB segments so the total size can exceed 2^31. The only known client of the function is parallel CREATE INDEX, which was reported to fail when building large indexes on Windows. Though this is technically an ABI break on platforms with a 32 bit off_t and we might normally avoid back-patching it, the function is brand new and thus unlikely to have been discovered by extension authors yet, and it's fairly thoroughly broken on those platforms anyway, so just fix it. Defect in `9da0cc35`. Bug #15460. Back-patch to 11, where this function landed. Author: Thomas Munro Reported-by: Paul van der Linden, Pavel Oskin Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/15460-b6db80de822fa0ad%40postgresql.org Discussion: https://postgr.es/m/CAHDGBJP_GsESbTt4P3FZA8kMUKuYxjg57XHF7NRBoKnR%3DCAR-g%40mail.gmail.com	2018-11-15 13:13:57 +13:00
Tom Lane	600b04d6b5	Add a timezone-specific variant of date_trunc(). date_trunc(field, timestamptz, zone_name) performs truncation using the named time zone as reference, rather than working in the session time zone as is the default behavior. It's equivalent to date_trunc(field, timestamptz at time zone zone_name) at time zone zone_name but it's faster, easier to type, and arguably easier to understand. Vik Fearing and Tom Lane Discussion: https://postgr.es/m/6249ffc4-2b22-4c1b-4e7d-7af84fedd7c6@2ndquadrant.com	2018-11-14 15:41:07 -05:00
Peter Eisentraut	1b5d797cd4	Lower lock level for renaming indexes Change lock level for renaming index (either ALTER INDEX or implicitly via some other commands) from AccessExclusiveLock to ShareUpdateExclusiveLock. One reason we need a strong lock for relation renaming is that the name change causes a rebuild of the relcache entry. Concurrent sessions that have the relation open might not be able to handle the relcache entry changing underneath them. Therefore, we need to lock the relation in a way that no one can have the relation open concurrently. But for indexes, the relcache handles reloads specially in RelationReloadIndexInfo() in a way that keeps changes in the relcache entry to a minimum. As long as no one keeps pointers to rd_amcache and rd_options around across possible relcache flushes, which is the case, this ought to be safe. We also want to use a self-exclusive lock for correctness, so that concurrent DDL doesn't overwrite the rename if they start updating while still seeing the old version. Therefore, we use ShareUpdateExclusiveLock, which is already used by other DDL commands that want to operate in a concurrent manner. The reason this is interesting at all is that renaming an index is a typical part of a concurrent reindexing workflow (CREATE INDEX CONCURRENTLY new + DROP INDEX CONCURRENTLY old + rename back). And indeed a future built-in REINDEX CONCURRENTLY might rely on the ability to do concurrent renames as well. Reviewed-by: Andrey Klychkov <aaklychkov@mail.ru> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1531767486.432607658@f357.i.mail.ru	2018-11-14 17:09:54 +01:00
Michael Paquier	b4721f3950	Initialize TransactionState and user ID consistently at transaction start If a failure happens when a transaction is starting between the moment the transaction status is changed from TRANS_DEFAULT to TRANS_START and the moment the current user ID and security context flags are fetched via GetUserIdAndSecContext(), or before initializing its basic fields, then those may get reset to incorrect values when the transaction aborts, leaving the session in an inconsistent state. One problem reported is that failing a starting transaction at the first query of a session could cause several kinds of system crashes on the follow-up queries. In order to solve that, move the initialization of the transaction state fields and the call of GetUserIdAndSecContext() in charge of fetching the current user ID close to the point where the transaction status is switched to TRANS_START, where there cannot be any error triggered in-between, per an idea of Tom Lane. This properly ensures that the current user ID, the security context flags and that the basic fields of TransactionState remain consistent even if the transaction fails while starting. Reported-by: Richard Guo Diagnosed-By: Richard Guo Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAN_9JTxECSb=pEPcb0a8d+6J+bDcOZ4=DgRo_B7Y5gRHJUM=Rw@mail.gmail.com Backpatch-through: 9.4	2018-11-14 16:46:53 +09:00
Michael Paquier	3be97b97ed	Add flag values in WAL description to all heap records Hexadecimal is consistently used as format to not bloat too much the output but keep it readable. This information is useful mainly for debugging purposes with for example pg_waldump. Author: Michael Paquier Reviewed-by: Nathan Bossart, Dmitry Dolgov, Andres Freund, Álvaro Herrera Discussion: https://postgr.es/m/20180413034734.GE1552@paquier.xyz	2018-11-14 10:33:10 +09:00
Michael Paquier	b52b7dc250	Refactor code creating PartitionBoundInfo The code building PartitionBoundInfo based on the constituent partition data read from catalogs has been located in partcache.c, with a specific set of routines dedicated to bound types, like sorting or bound data creation. All this logic is moved to partbounds.c and relocates all the bound-specific logistic into it, with partition_bounds_create() as principal entry point. Author: Amit Langote Reviewed-by: Michael Paquier, Álvaro Herrera Discussion: https://postgr.es/m/3f289da8-6d10-75fe-814a-635e8b191d43@lab.ntt.co.jp	2018-11-14 10:01:49 +09:00
Tom Lane	965a3d6be0	Fix realfailN lexer rules to not make assumptions about input format. The realfail1 and realfail2 backup-prevention rules always returned token type FCONST, ignoring the possibility that what we've scanned is more appropriately described as ICONST. I think that at the time that code was added, it might actually have been safe to not distinguish; but since we started allowing AS-less aliases in SELECT target lists, it's definitely legal to have a number immediately followed by an identifier. In the SELECT case, it seems there's no visible consequence because make_const() will change the type back to integer anyway. But I'm worried that there are other contexts, or will be in future, where it's more important to get the constant's type right. Hence, use process_integer_literal to correctly determine which token type to return. Arguably this is a bug fix, but given the lack of evidence of user-visible problems, I'll refrain from back-patching. Discussion: https://postgr.es/m/21364.1542136808@sss.pgh.pa.us	2018-11-13 14:54:41 -05:00
Tom Lane	ec937d0805	Align ECPG lexer more closely with the core and psql lexers. Make a bunch of basically-cosmetic changes to reduce the diffs between the flex rules in scan.l, psqlscan.l, and pgc.l. Reorder some code, adjust a lot of whitespace, sync some comments, make use of flex start condition scopes to do that. There are a few non-cosmetic changes in the ECPG lexer: * Bring over the decimalfail rule (and support function process_integer_literal) so that ECPG will lex "1..10" into the same tokens as the backend would. I'm not sure this makes any visible difference to users, but I'm not sure it doesn't, either. * <xdc><<EOF>> gets its own rule so as to produce a more on-point error message. * Remove duplicate <SQL>{xdstart} rule. John Naylor, with a few additional changes by me Discussion: https://postgr.es/m/CAJVSVGWGqY9YBs2EwtRUkbNv=hXkN8yRPOoD1wxE6COgvvrz5g@mail.gmail.com	2018-11-13 12:57:52 -05:00
Thomas Munro	b59d4d6c36	Fix const correctness warning. Per buildfarm.	2018-11-13 19:03:02 +13:00
Amit Kapila	a53bc135fb	Fix the initialization of atomic variables introduced by the group clearing mechanism. Commits `0e141c0fbb` and `baaf272ac9` introduced initialization of atomic variables in InitProcess which means that it's not safe to look at those for backends that aren't currently in use. Fix that by initializing them during postmaster startup. Reported-by: Andres Freund Author: Amit Kapila Backpatch-through: 9.6 Discussion: https://postgr.es/m/20181027104138.qmbbelopvy7cw2qv@alap3.anarazel.de	2018-11-13 10:52:40 +05:30
Thomas Munro	257ef3cd4f	Fix handling of HBA ldapserver with multiple hostnames. Commit `35c0754f` failed to handle space-separated lists of alternative hostnames in ldapserver, when building a URI for ldap_initialize() (OpenLDAP). Such lists need to be expanded to space-separated URIs. Repair. Back-patch to 11, to fix bug report #15495. Author: Thomas Munro Reported-by: Renaud Navarro Discussion: https://postgr.es/m/15495-2c39fc196c95cd72%40postgresql.org	2018-11-13 17:46:28 +13:00
Thomas Munro	6a3dcd2856	Fix possible buffer overrun in hba.c. Coverty reports a possible buffer overrun in the code that populates the pg_hba_file_rules view. It may not be a live bug due to restrictions on options that can be used together, but let's increase MAX_HBA_OPTIONS and correct a nearby misleading comment. Back-patch to 10 where this code arrived. Reported-by: Julian Hsiao Discussion: https://postgr.es/m/CADnGQpzbkWdKS2YHNifwAvX5VEsJ5gW49U4o-7UL5pzyTv4vTg%40mail.gmail.com	2018-11-13 16:27:13 +13:00
Michael Paquier	52b70b1c7d	Remove CommandCounterIncrement() after processing ON COMMIT DELETE This comes from `f9b5b41`, which is part of one the original commits that implemented ON COMMIT actions. By looking at the truncation code, any CCI needed happens locally when rebuilding indexes, so it looks safe to just remove this final incrementation. Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20181109024731.GF2652@paquier.xyz	2018-11-13 08:59:41 +09:00
Tom Lane	3de491482b	Simplify null-element handling in extension_config_remove(). There's no point in asking deconstruct_array() for a null-flags array when we already checked the array has no nulls, and aren't going to examine the output anyhow. Not asking for this output should make the code marginally faster, and it's also more robust since if there somehow were nulls, deconstruct_array() would throw an error. Daniel Gustafsson Discussion: https://postgr.es/m/289FFB8B-7AAB-48B5-A497-6E0D41D7BA47@yesql.se	2018-11-12 11:50:28 -05:00
Tom Lane	e3f005d974	Limit the number of index clauses considered in choose_bitmap_and(). classify_index_clause_usage() is O(N^2) in the number of distinct index qual clauses it considers, because of its use of a simple search list to store them. For nearly all queries, that's fine because only a few clauses will be considered. But Alexander Kuzmenkov reported a machine-generated query with 80000 (!) index qual clauses, which caused this code to take forever. Somewhat remarkably, this is the only O(N^2) behavior we now have for such a query, so let's fix it. We can get rid of the O(N^2) runtime for cases like this without much damage to the functionality of choose_bitmap_and() by separating out paths with "too many" qual or pred clauses, and deeming them to always be nonredundant with other paths. Then their clauses needn't go into the search list, so it doesn't get too long, but we don't lose the ability to consider bitmap AND plans altogether. I set the threshold for "too many" to be 100 clauses per path, which should be plenty to ensure no change in planning behavior for normal queries. There are other things we could do to make this go faster, but it's not clear that it's worth any additional effort. 80000 qual clauses require a whole lot of work in many other places, too. The code's been like this for a long time, so back-patch to all supported branches. The troublesome query only works back to 9.5 (in 9.4 it fails with stack overflow in the parser); so I'm not sure that fixing this in 9.4 has any real-world benefit, but perhaps it does. Discussion: https://postgr.es/m/90c5bdfa-d633-dabe-9889-3cf3e1acd443@postgrespro.ru	2018-11-12 11:19:04 -05:00
Andres Freund	450c7defa6	Remove volatiles from {procarray,volatile}.c and fix memory ordering issue. The use of volatiles in procarray.c largely originated from the time when postgres did not have reliable compiler and memory barriers. That's not the case anymore, so we can do better. Several of the functions in procarray.c can be bottlenecks, and removal of volatile yields mildly better code. The new state, with explicit memory barriers, is also more correct. The previous use of volatile did not actually deliver sufficient guarantees on weakly ordered machines, in particular the logic in GetNewTransactionId() does not look safe. It seems unlikely to be a problem in practice, but worth fixing. Thomas and I independently wrote a patch for this. Reported-By: Andres Freund and Thomas Munro Author: Andres Freund, with cherrypicked changes from a patch by Thomas Munro Discussion: https://postgr.es/m/20181005172955.wyjb4fzcdzqtaxjq@alap3.anarazel.de https://postgr.es/m/CAEepm=1nff0x=7i3YQO16jLA2qw-F9O39YmUew4oq-xcBQBs0g@mail.gmail.com	2018-11-10 16:11:57 -08:00
Peter Eisentraut	69ee2ff930	Apply RI trigger skipping tests also for DELETE The tests added in `cfa0f4255b` to skip firing an RI trigger if any old key value is NULL can also be applied for DELETE. This should give a performance gain in those cases, and it also saves a lot of duplicate code in the actual RI triggers. (That code was already dead code for the UPDATE cases.) Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2018-11-10 16:14:51 +01:00
Peter Eisentraut	34479d9a36	Remove dead foreign key optimization code The ri_KeysEqual() calls in the foreign-key trigger functions to optimize away some updates are useless because since `adfeef55cb` those triggers are not enqueued at all. (It's also not useful to keep these checks as some kind of backstop, since it's also semantically correct to just run the full check even with equal keys.) Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2018-11-10 16:14:51 +01:00
Andres Freund	5fde047f2b	Combine two flag tests in GetSnapshotData(). Previously the code checked PROC_IN_LOGICAL_DECODING and PROC_IN_VACUUM separately. As the relevant variable is marked as volatile, the compiler cannot combine the two tests. As GetSnapshotData() is pretty hot in a number of workloads, it's worthwhile to fix that. It'd also be a good idea to get rid of the volatiles altogether. But for one that's a larger patch, and for another, the code after this change still seems at least as easy to read as before. Author: Andres Freund Discussion: https://postgr.es/m/20181005172955.wyjb4fzcdzqtaxjq@alap3.anarazel.de	2018-11-09 20:43:56 -08:00
Tom Lane	fa2952d8eb	Fix missing role dependencies for some schema and type ACLs. This patch fixes several related cases in which pg_shdepend entries were never made, or were lost, for references to roles appearing in the ACLs of schemas and/or types. While that did no immediate harm, if a referenced role were later dropped, the drop would be allowed and would leave a dangling reference in the object's ACL. That still wasn't a big problem for normal database usage, but it would cause obscure failures in subsequent dump/reload or pg_upgrade attempts, taking the form of attempts to grant privileges to all-numeric role names. (I think I've seen field reports matching that symptom, but can't find any right now.) Several cases are fixed here: 1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any existing ACL entries for the domain. This case is ancient, dating back as far as we've had pg_shdepend tracking at all. 2. If a default type privilege applies, CREATE TYPE recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for types in 9.2. 3. If a default schema privilege applies, CREATE SCHEMA recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for schemas in v10 (commit `ab89e465c`). Another somewhat-related problem is that when creating a relation rowtype or implicit array type, TypeCreate would apply any available default type privileges to that type, which we don't really want since such an object isn't supposed to have privileges of its own. (You can't, for example, drop such privileges once they've been added to an array type.) `ab89e465c` is also to blame for a race condition in the regression tests: privileges.sql transiently installed globally-applicable default privileges on schemas, which sometimes got absorbed into the ACLs of schemas created by concurrent test scripts. This should have resulted in failures when privileges.sql tried to drop the role holding such privileges; but thanks to the bug fixed here, it instead led to dangling ACLs in the final state of the regression database. We'd managed not to notice that, but it became obvious in the wake of commit `da906766c`, which allowed the race condition to occur in pg_upgrade tests. To fix, add a function recordDependencyOnNewAcl to encapsulate what callers of get_user_default_acl need to do; while the original call sites got that right via ad-hoc code, none of the later-added ones have. Also change GenerateTypeDependencies to generate these dependencies, which requires adding the typacl to its parameter list. (That might be annoying if there are any extensions calling that function directly; but if there are, they're most likely buggy in the same way as the core callers were, so they need work anyway.) While I was at it, I changed GenerateTypeDependencies to accept most of its parameters in the form of a Form_pg_type pointer, making its parameter list a bit less unwieldy and mistake-prone. The test race condition is fixed just by wrapping the addition and removal of default privileges into a single transaction, so that that state is never visible externally. We might eventually prefer to separate out tests of default privileges into a script that runs by itself, but that would be a bigger change and would make the tests run slower overall. Back-patch relevant parts to all supported branches. Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us	2018-11-09 20:42:14 -05:00
Andres Freund	c670d0faac	Remove ineffective check against dropped columns from slot_getattr(). Before this commit slot_getattr() checked for dropped columns (returning NULL in that case), but only after checking for previously deformed columns. As slot_deform_tuple() does not contain such a check, the check in slot_getattr() would often not have been reached, depending on previous use of the slot. These days locking and plan invalidation ought to ensure that dropped columns are not accessed in query plans. Therefore this commit just drops the insufficient check in slot_getattr(). It's possible that we'll find some holes againt use of dropped columns, but if so, those need to be addressed independent of slot_getattr(), as most accesses don't go through that function anyway. Author: Andres Freund Discussion: https://postgr.es/m/20181107174403.zai7fedgcjoqx44p@alap3.anarazel.de	2018-11-09 17:40:40 -08:00
Andres Freund	1ef6bd2954	Don't require return slots for nodes without projection. In a lot of nodes the return slot is not required. That can either be because the node doesn't do any projection (say an Append node), or because the node does perform projections but the projection is optimized away because the projection would yield an identical row. Slots aren't that small, especially for wide rows, so it's worthwhile to avoid creating them. It's not possible to just skip creating the slot - it's currently used to determine the tuple descriptor returned by ExecGetResultType(). So separate the determination of the result type from the slot creation. The work previously done internally ExecInitResultTupleSlotTL() can now also be done separately with ExecInitResultTypeTL() and ExecInitResultSlot(). That way nodes that aren't guaranteed to need a result slot, can use ExecInitResultTypeTL() to determine the result type of the node, and ExecAssignScanProjectionInfo() (via ExecConditionalAssignProjectionInfo()) determines that a result slot is needed, it is created with ExecInitResultSlot(). Besides the advantage of avoiding to create slots that then are unused, this is necessary preparation for later patches around tuple table slot abstraction. In particular separating the return descriptor and slot is a prerequisite to allow JITing of tuple deforming with knowledge of the underlying tuple format, and to avoid unnecessarily creating JITed tuple deforming for virtual slots. This commit removes a redundant argument from ExecInitResultTupleSlotTL(). While this commit touches a lot of the relevant lines anyway, it'd normally still not worthwhile to cause breakage, except that aforementioned later commits will touch all ExecInitResultTupleSlotTL() callers anyway (but fits worse thematically). Author: Andres Freund Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-09 17:19:39 -08:00
Michael Paquier	319a810180	Fix dependency handling of partitions and inheritance for ON COMMIT This commit fixes a set of issues with ON COMMIT actions when used on partitioned tables and tables with inheritance children: - Applying ON COMMIT DROP on a partitioned table with partitions or on a table with inheritance children caused a failure at commit time, with complains about the children being already dropped as all relations are dropped one at the same time. - Applying ON COMMIT DELETE on a partition relying on a partitioned table which uses ON COMMIT DROP would cause the partition truncation to fail as the parent is removed first. The solution to the first problem is to handle the removal of all the dependencies in one go instead of dropping relations one-by-one, based on a suggestion from Álvaro Herrera. So instead all the relation OIDs to remove are gathered and then processed in one round of multiple deletions. The solution to the second problem is to reorder the actions, with truncation happening first and relation drop done after. Even if it means that a partition could be first truncated, then immediately dropped if its partitioned table is dropped, this has the merit to keep the code simple as there is no need to do existence checks on the relations to drop. Contrary to a manual TRUNCATE on a partitioned table, ON COMMIT DELETE does not cascade to its partitions. The ON COMMIT action defined on each partition gets the priority. Author: Michael Paquier Reviewed-by: Amit Langote, Álvaro Herrera, Robert Haas Discussion: https://postgr.es/m/68f17907-ec98-1192-f99f-8011400517f5@lab.ntt.co.jp Backpatch-through: 10	2018-11-09 10:03:22 +09:00
Tom Lane	3d360e20c9	Disallow setting client_min_messages higher than ERROR. Previously it was possible to set client_min_messages to FATAL or PANIC, which had the effect of suppressing transmission of regular ERROR messages to the client. Perhaps that seemed like a useful option in the past, but the trouble with it is that it breaks guarantees that are explicitly made in our FE/BE protocol spec about how a query cycle can end. While libpq and psql manage to cope with the omission, that's mostly because they are not very bright; client libraries that have more semantic knowledge are likely to get confused. Notably, pgODBC doesn't behave very sanely. Let's fix this by getting rid of the ability to set client_min_messages above ERROR. In HEAD, just remove the FATAL and PANIC options from the set of allowed enum values for client_min_messages. (This change also affects trace_recovery_messages, but that's OK since these aren't useful values for that variable either.) In the back branches, there was concern that rejecting these values might break applications that are explicitly setting things that way. I'm pretty skeptical of that argument, but accommodate it by accepting these values and then internally setting the variable to ERROR anyway. In all branches, this allows a couple of tiny simplifications in the logic in elog.c, so do that. Also respond to the point that was made that client_min_messages has exactly nothing to do with the server's logging behavior, and therefore does not belong in the "When To Log" subsection of the documentation. The "Statement Behavior" subsection is a better match, so move it there. Jonah Harris and Tom Lane Discussion: https://postgr.es/m/7809.1541521180@sss.pgh.pa.us Discussion: https://postgr.es/m/15479-ef0f4cc2fd995ca2@postgresql.org	2018-11-08 17:33:43 -05:00
Alvaro Herrera	705d433fd5	Revise attribute handling code on partition creation The original code to propagate NOT NULL and default expressions specified when creating a partition was mostly copy-pasted from typed-tables creation, but not being a great match it contained some duplicity, inefficiency and bugs. This commit fixes the bug that NOT NULL constraints declared in the parent table would not be honored in the partition. One reported issue that is not fixed is that a DEFAULT declared in the child is not used when inserting through the parent. That would amount to a behavioral change that's better not back-patched. This rewrite makes the code simpler: 1. instead of checking for duplicate column names in its own block, reuse the original one that already did that; 2. instead of concatenating the list of columns from parent and the one declared in the partition and scanning the result to (incorrectly) propagate defaults and not-null constraints, just scan the latter searching the former for a match, and merging sensibly. This works because we know the list in the parent is already correct and there can only be one parent. This rewrite makes ColumnDef->is_from_parent unused, so it's removed on branch master; on released branches, it's kept as an unused field in order not to cause ABI incompatibilities. This commit also adds a test case for creating partitions with collations mismatching that on the parent table, something that is closely related to the code being patched. No code change is introduced though, since that'd be a behavior change that could break some (broken) working applications. Amit Langote wrote a less invasive fix for the original NOT NULL/defaults bug, but while I kept the tests he added, I ended up not using his original code. Ashutosh Bapat reviewed Amit's fix. Amit reviewed mine. Author: Álvaro Herrera, Amit Langote Reviewed-by: Ashutosh Bapat, Amit Langote Reported-by: Jürgen Strobel (bug #15212) Discussion: https://postgr.es/m/152746742177.1291.9847032632907407358@wrigleys.postgresql.org	2018-11-08 16:22:09 -03:00
Michael Paquier	170dccc69d	Fix incorrect routine name reference in partprune.c Author: Yuzuko Hosoya Discussion: https://postgr.es/m/00ac01d4774c$7feac860$7fc05920$@lab.ntt.co.jp	2018-11-08 20:14:16 +09:00
Andres Freund	c14a8ff27e	Fixup for `b84a6dafbf` triggering assert failure in LLVM debug builds. Author: Andres Freund	2018-11-07 14:00:14 -08:00
Andres Freund	b84a6dafbf	Move EEOP_*_SYSVAR evaluation out of line. This mainly de-duplicates code. As evaluating a system variable isn't the hottest path and the current inline implementation ends up calling out to an external function anyway, this is OK from a performance POV. The main motivation for de-duplicating is the upcoming slot abstraction work, after which there's not guaranteed to be a HeapTuple backing the slot. Author: Andres Freund, Amit Khandekar Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de	2018-11-07 11:08:45 -08:00
Andres Freund	5f32b29c18	Build HashState's hashkeys expression with the correct parent. Previously the expressions were built with the HashJoinState as a parent. That's incorrect. Currently this does not appear to be harmful, but for the upcoming 'slot abstraction' work this proves to be problematic, as the underlying slot types can differ between Hash and HashJoin. It's possible that this already causes a problem, but I've not been able to come up with a scenario. Therefore don't backpatch at this point. Author: Andres Freund Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-11-07 09:25:54 -08:00
Tom Lane	c6e4133fae	Postpone calculating total_table_pages until after pruning/exclusion. The planner doesn't make any use of root->total_table_pages until it estimates costs of indexscans, so we don't need to compute it as early as that's currently done. By doing the calculation between set_base_rel_sizes and set_base_rel_pathlists, we can omit relations that get removed from the query by partition pruning or constraint exclusion, which seems like a more accurate basis for costing. (Historical note: I think at the time this code was written, there was not a separation between the "set sizes" and "set pathlists" steps, so that this approach would have been impossible at the time. But now that we have that separation, this is clearly the better way to do things.) David Rowley, reviewed by Edmund Horner Discussion: https://postgr.es/m/CAKJS1f-NG1mRM0VOtkAG7=ZLQWihoqees9R4ki3CKBB0-fRfCA@mail.gmail.com	2018-11-07 12:12:56 -05:00
Tom Lane	5d28c9bd73	Disable recheck_on_update optimization to avoid crashes. The code added by commit `c203d6cf8` causes a crash in at least one case, where a potentially-optimizable expression index has a storage type different from the input data type. A cursory code review turned up numerous other problems that seem impractical to fix on short notice. Andres argued for revert of that patch some time ago, and if additional senior committers had been paying attention, that's likely what would have happened, but we were not :-( At this point we can't just revert, at least not in v11, because that would mean an ABI break for code touching relcache entries. And we should not remove the (also buggy) support for the recheck_on_update index reloption, since it might already be used in some databases in the field. So this patch just does the as-little-invasive-as-possible measure of disabling the feature as though recheck_on_update were forced off for all indexes. I also removed the related regression tests (which would otherwise fail) and the user-facing documentation of the reloption. We should undertake a more thorough code cleanup if the patch can't be fixed, but not under the extreme time pressure of being already overdue for 11.1 release. Per report from Ondřej Bouda and subsequent private discussion among pgsql-release. Discussion: https://postgr.es/m/20181106185255.776mstcyehnc63ty@alvherre.pgsql	2018-11-06 18:33:28 -05:00
Thomas Munro	c4f0876fb8	Remove set-but-unused variable. Clean-up for commit `c24dcd0c`. Reported-by: Andrew Dunstan Discussion: https://postgr.es/m/2d52ff4a-5440-f6f1-7806-423b0e6370cb%402ndQuadrant.com	2018-11-07 12:06:43 +13:00
Andrew Gierth	5613da4cc7	Optimize nested ConvertRowtypeExpr nodes. A ConvertRowtypeExpr is used to translate a whole-row reference of a child to that of a parent. The planner produces nested ConvertRowtypeExpr while translating whole-row reference of a leaf partition in a multi-level partition hierarchy. Executor then translates the whole-row reference from the leaf partition into all the intermediate parent's whole-row references before arriving at the final whole-row reference. It could instead translate the whole-row reference from the leaf partition directly to the top-most parent's whole-row reference skipping any intermediate translations. Ashutosh Bapat, with tests by Kyotaro Horiguchi and some editorialization by me. Reviewed by Andres Freund, Pavel Stehule, Kyotaro Horiguchi, Dmitry Dolgov, Tom Lane.	2018-11-06 21:10:10 +00:00
Thomas Munro	c24dcd0cfd	Use pg_pread() and pg_pwrite() for data files and WAL. Cut down on system calls by doing random I/O using offset-based OS routines where available. Remove the code for tracking the 'virtual' seek position. The only reason left to call FileSeek() was to get the file's size, so provide a new function FileSize() instead. Author: Oskari Saarenmaa, Thomas Munro Reviewed-by: Thomas Munro, Jesper Pedersen, Tom Lane, Alvaro Herrera Discussion: https://postgr.es/m/CAEepm=02rapCpPR3ZGF2vW=SBHSdFYO_bz_f-wwWJonmA3APgw@mail.gmail.com Discussion: https://postgr.es/m/b8748d39-0b19-0514-a1b9-4e5a28e6a208%40gmail.com Discussion: https://postgr.es/m/a86bd200-ebbe-d829-e3ca-0c4474b2fcb7%40ohmu.fi	2018-11-07 09:51:50 +13:00
Bruce Momjian	b43df566b3	GUC: adjust effective_cache_size SQL descriptions Follow on patch for commit `3e0f1a4741`. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/369ec766-b947-51bd-4dad-6fb9e026439f@2ndquadrant.com Backpatch-through: 9.4	2018-11-06 13:40:03 -05:00
Tom Lane	003c68a3b4	Rename rbtree.c functions to use "rbt" prefix not "rb" prefix. The "rb" prefix is used by Ruby, so that our existing code results in name collisions that break plruby. We discussed ways to prevent that by adjusting dynamic linker options, but it seems that at best we'd move the pain to other cases. Renaming to avoid the collision is the only portable fix anyway. Fortunately, our rbtree code is not (yet?) widely used --- in core, there's only a single usage in GIN --- so it seems likely that we can get away with a rename. I chose to do this basically as s/rb/rbt/g, except for places where there already was a "t" after "rb". The patch could have been made smaller by only touching linker-visible symbols, but it would have resulted in oddly inconsistent-looking code. Better to make it look like "rbt" was the plan all along. Back-patch to v10. The rbtree.c code exists back to 9.5, but rb_iterate() which is the actual immediate source of pain was added in v10, so it seems like changing the names before that would have more risk than benefit. Per report from Pavel Raiskup. Discussion: https://postgr.es/m/4738198.8KVIIDhgEB@nb.usersys.redhat.com	2018-11-06 13:25:24 -05:00
Thomas Munro	9e12fb02b7	Remove some remaining traces of dsm_resize(). A couple of obsolete comments and unreachable blocks remained after commit `3c60d0fa`. Discussion: https://postgr.es/m/CAA4eK1%2B%3DyAFUvpFoHXFi_gm8YqmXN-TtkFH%2BVYjvDLS6-SFq-Q%40mail.gmail.com	2018-11-06 21:40:08 +13:00
Michael Paquier	8f045e242b	Switch pg_promote to be parallel-safe pg_promote uses nothing relying on a global state, so it is fine to mark it as parallel-safe, conclusion based on a detailed analysis from Robert Haas. This also fixes an inconsistency where pg_proc.dat missed to mark the function with its previous value for proparallel, update which does not matter now as the default is used. Based on a discussion between multiple folks: Laurenz Albe, Robert Haas, Amit Kapila, Tom Lane and myself. Discussion: https://postgr.es/m/20181029082530.GL14242@paquier.xyz	2018-11-06 14:11:21 +09:00
Thomas Munro	3c60d0fa23	Remove dsm_resize() and dsm_remap(). These interfaces were never used in core, didn't handle failure of posix_fallocate() correctly and weren't supported on all platforms. We agreed to remove them in 12. Author: Thomas Munro Reported-by: Andres Freund Discussion: https://postgr.es/m/CAA4eK1%2B%3DyAFUvpFoHXFi_gm8YqmXN-TtkFH%2BVYjvDLS6-SFq-Q%40mail.gmail.com	2018-11-06 16:11:12 +13:00
Andres Freund	a3fb382e9c	Fix copy-paste error in errhint() introduced in `691d79a079`. Reported-By: Petr Jelinek Discussion: https://postgr.es/m/c95a620b-34f0-7930-aeb5-f7ab804f26cb@2ndquadrant.com Backpatch: 9.4-, like the previous commit	2018-11-05 12:05:38 -08:00

1 2 3 4 5 ...

18834 Commits