postgresql

Commit Graph

Author	SHA1	Message	Date
Magnus Hagander	55d57359f2	Fix typos in comments and debug message Antonin Houska	2016-07-18 18:46:57 +02:00
Peter Eisentraut	7d67606569	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 3d71988dffd3c0798a8864c55ca4b7833b48abb1	2016-07-18 12:07:49 -04:00
Andres Freund	eca0f1db14	Clear all-frozen visibilitymap status when locking tuples. Since `a892234` & `fd31cd265` the visibilitymap's freeze bit is used to avoid vacuuming the whole relation in anti-wraparound vacuums. Doing so correctly relies on not adding xids to the heap without also unsetting the visibilitymap flag. Tuple locking related code has not done so. To allow selectively resetting all-frozen - to avoid pessimizing heap_lock_tuple - allow to selectively reset the all-frozen with visibilitymap_clear(). To avoid having to use visibilitymap_get_status (e.g. via VM_ALL_FROZEN) inside a critical section, have visibilitymap_clear() return whether any bits have been reset. There's a remaining issue (denoted by XXX): After the PageIsAllVisible() check in heap_lock_tuple() and heap_lock_updated_tuple_rec() the page status could theoretically change. Practically that currently seems impossible, because updaters will hold a page level pin already. Due to the next beta coming up, it seems better to get the required WAL magic bump done before resolving this issue. The added flags field fields to xl_heap_lock and xl_heap_lock_updated require bumping the WAL magic. Since there's already been a catversion bump since the last beta, that's not an issue. Reviewed-By: Robert Haas, Amit Kapila and Andres Freund Author: Masahiko Sawada, heavily revised by Andres Freund Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com Backpatch: -	2016-07-18 02:01:13 -07:00
Tom Lane	65632082b7	Remove obsolete comment. Peter Geoghegan	2016-07-17 19:18:19 -04:00
Tom Lane	9563d5b5e4	Add regression test case exercising the sorting path for hash index build. We've broken this code path at least twice in the past, so it's prudent to have a test case that covers it. To allow exercising the code path without creating a very large (and slow to run) test case, redefine the sort threshold to be bounded by maintenance_work_mem as well as the number of available buffers. While at it, fix an ancient oversight that when building a temp index, the number of available buffers is not NBuffers but NLocBuffer. Also, if assertions are enabled, apply a direct test that the sort actually does return the tuples in the expected order. Peter Geoghegan Patch: <CAM3SWZTBAo4hjbBd780+MrOKiKp_TMo1N3A0Rw9_im8gbD7fQA@mail.gmail.com>	2016-07-16 15:30:15 -04:00
Tom Lane	278148907a	Fix crash in close_ps() for NaN input coordinates. The Assert() here seems unreasonably optimistic. Andreas Seltenreich found that it could fail with NaNs in the input geometries, and it seems likely to me that it might fail in corner cases due to roundoff error, even for ordinary input values. As a band-aid, make the function return SQL NULL instead of crashing. Report: <87d1md1xji.fsf@credativ.de>	2016-07-16 14:42:37 -04:00
Andres Freund	bfa2ab56bb	Fix torn-page, unlogged xid and further risks from heap_update(). When heap_update needs to look for a page for the new tuple version, because the current one doesn't have sufficient free space, or when columns have to be processed by the tuple toaster, it has to release the lock on the old page during that. Otherwise there'd be lock ordering and lock nesting issues. To avoid concurrent sessions from trying to update / delete / lock the tuple while the page's content lock is released, the tuple's xmax is set to the current session's xid. That unfortunately was done without any WAL logging, thereby violating the rule that no XIDs may appear on disk, without an according WAL record. If the database were to crash / fail over when the page level lock is released, and some activity lead to the page being written out to disk, the xid could end up being reused; potentially leading to the row becoming invisible. There might be additional risks by not having t_ctid point at the tuple itself, without having set the appropriate lock infomask fields. To fix, compute the appropriate xmax/infomask combination for locking the tuple, and perform WAL logging using the existing XLOG_HEAP_LOCK record. That allows the fix to be backpatched. This issue has existed for a long time. There appears to have been partial attempts at preventing dangers, but these never have fully been implemented, and were removed a long time ago, in `11919160` (cf. HEAP_XMAX_UNLOGGED). In master / 9.6, there's an additional issue, namely that the visibilitymap's freeze bit isn't reset at that point yet. Since that's a new issue, introduced only in `a892234f83`, that'll be fixed in a separate commit. Author: Masahiko Sawada and Andres Freund Reported-By: Different aspects by Thomas Munro, Noah Misch, and others Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com Backpatch: 9.1/all supported versions	2016-07-15 17:49:48 -07:00
Andres Freund	a4d357bfbd	Make HEAP_LOCK/HEAP2_LOCK_UPDATED replay reset HEAP_XMAX_INVALID. `0ac5ad5` started to compress infomask bits in WAL records. Unfortunately the replay routines for XLOG_HEAP_LOCK/XLOG_HEAP2_LOCK_UPDATED forgot to reset the HEAP_XMAX_INVALID (and some other) hint bits. Luckily that's not problematic in the majority of cases, because after a crash/on a standby row locks aren't meaningful. Unfortunately that does not hold true in the presence of prepared transactions. This means that after a crash, or after promotion, row level locks held by a prepared, but not yet committed, prepared transaction might not be enforced. Discussion: 20160715192319.ubfuzim4zv3rqnxv@alap3.anarazel.de Backpatch: 9.3, the oldest branch on which `0ac5ad5` is present.	2016-07-15 14:45:37 -07:00
Tom Lane	45639a0525	Avoid invalidating all foreign-join cached plans when user mappings change. We must not push down a foreign join when the foreign tables involved should be accessed under different user mappings. Previously we tried to enforce that rule literally during planning, but that meant that the resulting plans were dependent on the current contents of the pg_user_mapping catalog, and we had to blow away all cached plans containing any remote join when anything at all changed in pg_user_mapping. This could have been improved somewhat, but the fact that a syscache inval callback has very limited info about what changed made it hard to do better within that design. Instead, let's change the planner to not consider user mappings per se, but to allow a foreign join if both RTEs have the same checkAsUser value. If they do, then they necessarily will use the same user mapping at runtime, and we don't need to know specifically which one that is. Post-plan-time changes in pg_user_mapping no longer require any plan invalidation. This rule does give up some optimization ability, to wit where two foreign table references come from views with different owners or one's from a view and one's directly in the query, but nonetheless the same user mapping would have applied. We'll sacrifice the first case, but to not regress more than we have to in the second case, allow a foreign join involving both zero and nonzero checkAsUser values if the nonzero one is the same as the prevailing effective userID. In that case, mark the plan as only runnable by that userID. The plancache code already had a notion of plans being userID-specific, in order to support RLS. It was a little confused though, in particular lacking clarity of thought as to whether it was the rewritten query or just the finished plan that's dependent on the userID. Rearrange that code so that it's clearer what depends on which, and so that the same logic applies to both RLS-injected role dependency and foreign-join-injected role dependency. Note that this patch doesn't remove the other issue mentioned in the original complaint, which is that while we'll reliably stop using a foreign join if it's disallowed in a new context, we might fail to start using a foreign join if it's now allowed, but we previously created a generic cached plan that didn't use one. It was agreed that the chance of winning that way was not high enough to justify the much larger number of plan invalidations that would have to occur if we tried to cause it to happen. In passing, clean up randomly-varying spelling of EXPLAIN commands in postgres_fdw.sql, and fix a COSTS ON example that had been allowed to leak into the committed tests. This reverts most of commits `fbe5a3fb7` and `5d4171d1c`, which were the previous attempt at ensuring we wouldn't push down foreign joins that span permissions contexts. Etsuro Fujita and Tom Lane Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>	2016-07-15 17:23:02 -04:00
Alvaro Herrera	533e9c6b06	Avoid serializability errors when locking a tuple with a committed update When key-share locking a tuple that has been not-key-updated, and the update is a committed transaction, in some cases we raised serializability errors: ERROR: could not serialize access due to concurrent update Because the key-share doesn't conflict with the update, the error is unnecessary and inconsistent with the case that the update hasn't committed yet. This causes problems for some usage patterns, even if it can be claimed that it's sufficient to retry the aborted transaction: given a steady stream of updating transactions and a long locking transaction, the long transaction can be starved indefinitely despite multiple retries. To fix, we recognize that HeapTupleSatisfiesUpdate can return HeapTupleUpdated when an updating transaction has committed, and that we need to deal with that case exactly as if it were a non-committed update: verify whether the two operations conflict, and if not, carry on normally. If they do conflict, however, there is a difference: in the HeapTupleBeingUpdated case we can just sleep until the concurrent transaction is gone, while in the HeapTupleUpdated case this is not possible and we must raise an error instead. Per trouble report from Olivier Dony. In addition to a couple of test cases that verify the changed behavior, I added a test case to verify the behavior that remains unchanged, namely that errors are raised when a update that modifies the key is used. That must still generate serializability errors. One pre-existing test case changes behavior; per discussion, the new behavior is actually the desired one. Discussion: https://www.postgresql.org/message-id/560AA479.4080807@odoo.com https://www.postgresql.org/message-id/20151014164844.3019.25750@wrigleys.postgresql.org Backpatch to 9.3, where the problem appeared.	2016-07-15 14:17:20 -04:00
Teodor Sigaev	00f304ce2d	Fix parsing NOT sequence in tsquery Digging around bug #14245 I found that commit `6734a1cacd` missed that NOT operation is right associative in opposite to all other. This miss is resposible for tsquery parser fail on sequence of NOT operations	2016-07-15 20:01:41 +03:00
Teodor Sigaev	19d290155d	Fix nested NOT operation cleanup in tsquery. During normalization of tsquery tree it tries to simplify nested NOT operations but there it's obvioulsy missed that subsequent node could be a leaf node (value node) Bug #14245: Segfault on weird to_tsquery Reported by David Kellum.	2016-07-15 19:22:18 +03:00
Peter Eisentraut	63cfdb8dde	Adjust spellings of forms of "cancel"	2016-07-14 22:48:26 -04:00
Tom Lane	1acf757255	Fix GiST index build for NaN values in geometric types. GiST index build could go into an infinite loop when presented with boxes (or points, circles or polygons) containing NaN component values. This happened essentially because the code assumed that x == x is true for any "double" value x; but it's not true for NaNs. The looping behavior was not the only problem though: we also attempted to sort the items using simple double comparisons. Since NaNs violate the trichotomy law, qsort could (in principle at least) get arbitrarily confused and mess up the sorting of ordinary values as well as NaNs. And we based splitting choices on box size calculations that could produce NaNs, again resulting in undesirable behavior. To fix, replace all comparisons of doubles in this logic with float8_cmp_internal, which is NaN-aware and is careful to sort NaNs consistently, higher than any non-NaN. Also rearrange the box size calculation to not produce NaNs; instead it should produce an infinity for a box with NaN on one side and not-NaN on the other. I don't by any means claim that this solves all problems with NaNs in geometric values, but it should at least make GiST index insertion work reliably with such data. It's likely that the index search side of things still needs some work, and probably regular geometric operations too. But with this patch we're laying down a convention for how such cases ought to behave. Per bug #14238 from Guang-Dih Lei. Back-patch to 9.2; the code used before commit `7f3bd86843` is quite different and doesn't lock up on my simple test case, nor on the submitter's dataset. Report: <20160708151747.1426.60150@wrigleys.postgresql.org> Discussion: <28685.1468246504@sss.pgh.pa.us>	2016-07-14 18:45:59 -04:00
Tom Lane	cec5501394	Add a regression test case to improve code coverage for tuplesort. Test the external-sort code path in CLUSTER for two different scenarios: multiple-pass external sorting, and the best case for replacement selection, where only one run is produced, so that no merge is required. This test would have caught the bug fixed in commit `1b0fc8507`, at least when run with valgrind enabled. In passing, add a short-circuit test in plan_cluster_use_sort() to make dead certain that it selects sorting when enable_indexscan is off. As things stand, that would happen anyway, but it seems like good future proofing for this test. Peter Geoghegan Discussion: <CAM3SWZSgxehDkDMq1FdiW2A0Dxc79wH0hz1x-TnGy=1BXEL+nw@mail.gmail.com>	2016-07-13 15:23:56 -04:00
Peter Eisentraut	5d4050064b	Add serial comma and quoting to message	2016-07-12 18:37:34 -04:00
Tom Lane	4d042999f9	Print a given subplan only once in EXPLAIN. We have, for a very long time, allowed the same subplan (same member of the PlannedStmt.subplans list) to be referenced by more than one SubPlan node; this avoids problems for cases such as subplans within an IndexScan's indxqual and indxqualorig fields. However, EXPLAIN had not gotten the memo and would print each reference as though it were an independent identical subplan. To fix, track plan_ids of subplans we've printed and don't print the same plan_id twice. Per report from Pavel Stehule. BTW: the particular case of IndexScan didn't cause visible duplication in a plain EXPLAIN, only EXPLAIN ANALYZE, because in the former case we short-circuit executor startup before the indxqual field is processed by ExecInitExpr. That seems like it could easily lead to other EXPLAIN problems in future, but it's not clear how to avoid it without breaking the "EXPLAIN a plan using hypothetical indexes" use-case. For now I've left that issue alone. Although this is a longstanding bug, it's purely cosmetic (no great harm is done by the repeat printout) and we haven't had field complaints before. So I'm hesitant to back-patch it, especially since there is some small risk of ABI problems due to the need to add a new field to ExplainState. In passing, rearrange order of fields in ExplainState to be less random, and update some obsolete comments about when/where to initialize them. Report: <CAFj8pRAimq+NK-menjt+3J4-LFoodDD8Or6=Lc_stcFD+eD4DA@mail.gmail.com>	2016-07-11 18:14:29 -04:00
Magnus Hagander	87d84d67bb	Fix start WAL filename for concurrent backups from standby On a standby, ThisTimelineID is always 0, so we would generate a filename in timeline 0 even for other timelines. Instead, use starttli which we have retreived from the controlfile. Report by: Francesco Canovai in bug #14230 Author: Marco Nenciarini Reviewed by: Michael Paquier and Amit Kapila	2016-07-11 12:02:31 +02:00
Tom Lane	96112ee7c6	Revert "Add some temporary code to record stack usage at server process exit." This reverts commit `88cf37d2a8` as well as follow-on commits `ea9c4a16d5` and `c57562725d`. We've learned about as much as we can from the buildfarm.	2016-07-10 12:44:20 -04:00
Tom Lane	c57562725d	Improve recording of IA64 stack data. Examination of the results from anole and gharial suggests that we're only managing to track the size of one of the two stacks of IA64 machines. Some googling gave the answer: on HPUX11, the register stack is reported as a page type I don't see in pstat.h on my HPUX10 box. Let's try testing for that.	2016-07-09 15:00:22 -04:00
Tom Lane	ea9c4a16d5	Add more temporary code to record stack usage at server process exit. After a look at preliminary results from commit `88cf37d2a8`, I realized it'd be a good idea to spew out the maximum depth measurement seen by check_stack_depth. So add some quick-n-dirty code to do that. Like the previous commit, this will be reverted once we've gathered a set of buildfarm runs with it.	2016-07-08 16:38:22 -04:00
Tom Lane	88cf37d2a8	Add some temporary code to record stack usage at server process exit. This patch is meant to gather information from the buildfarm members, and will be reverted in a day or so. The idea is to try to find out the high-water stack consumption while running the regression tests, particularly on IA64 which is suspected to use much more stack than other architectures. On machines with pmap, we can use that; but the IA64 farm members are running HPUX, so also include some bespoke code for HPUX. (I've tested the latter on HPUX 10/HPPA; not entirely sure it will work on HPUX 11/IA64, but we'll soon find out.) Discussion: <CAM-w4HMwwcwaVvYcAH0_FGtG5GeXdYVRfvG81pXnSJWHnCfosQ@mail.gmail.com>	2016-07-08 12:01:08 -04:00
Robert Haas	3edcdbcc4d	Fix typo in comment. Amit Langote	2016-07-07 16:13:37 -04:00
Robert Haas	1b0fc85077	Properly adjust pointers when tuples are moved during CLUSTER. Otherwise, when we abandon incremental memory accounting and use batch allocation for the final merge pass, we might crash. This has been broken since `0011c0091e`. Peter Geoghegan, tested by Noah Misch	2016-07-07 13:47:16 -04:00
Robert Haas	b22934dc03	Fix a prototype which is inconsistent with the function definition. Peter Geoghegan	2016-07-07 13:46:51 -04:00
Robert Haas	d1f822e585	Clarify resource utilization of parallel query. temp_file_limit is a per-process limit, not a per-session limit across all cooperating parallel processes; change wording accordingly, per a suggestion from Tom Lane. Also, document under max_parallel_workers_per_gather the fact that each process involved in a parallel query may use as many resources as a separate session. Caveat emptor. Per a complaint from Peter Geoghegan.	2016-07-07 11:35:08 -04:00
Fujii Masao	60d50769b7	Rename pg_stat_wal_receiver.conn_info to conninfo. Per discussion on pgsql-hackers, conninfo is better as the column name because it's more commonly used in PostgreSQL. Catalog version bumped due to the change of pg_proc. Author: Michael Paquier	2016-07-07 12:59:39 +09:00
Peter Eisentraut	397bf6eed8	Fix typos	2016-07-06 21:18:03 -04:00
Fujii Masao	1109164913	Fix typo in comment. Author: Masahiko Sawada	2016-07-06 18:59:17 +09:00
Tom Lane	9c810a2edc	Fix failure to handle conflicts in non-arbiter exclusion constraints. ExecInsertIndexTuples treated an exclusion constraint as subject to noDupErr processing even when it was not listed in arbiterIndexes, and would therefore not error out for a conflict in such a constraint, instead returning it as an arbiter-index failure. That led to an infinite loop in ExecInsert, since ExecCheckIndexConstraints ignored the index as-intended and therefore didn't throw the expected error. To fix, make the exclusion constraint code path use the same condition as the index_insert call does to decide whether no-error-for-duplicates behavior is appropriate. While at it, refactor a little bit to avoid unnecessary list_member_oid calls. (That surely wouldn't save anything worth noticing, but I find the code a bit clearer this way.) Per bug report from Heikki Rauhala. Back-patch to 9.5 where ON CONFLICT was introduced. Report: <4C976D6B-76B4-434C-8052-D009F7B7AEDA@reaktor.fi>	2016-07-04 16:09:11 -04:00
Tom Lane	29a2195de6	Typo fix.	2016-07-03 18:43:43 -04:00
Tom Lane	110a6dbdeb	Allow RTE_SUBQUERY rels to be considered parallel-safe. There isn't really any reason not to; the original comments here were partly confused about subplans versus subquery-in-FROM, and partly dependent on restrictions that no longer apply now that subqueries return Paths not Plans. Depending on what's inside the subquery, it might fail to produce any parallel_safe Paths, but that's fine. Tom Lane and Robert Haas	2016-07-03 18:24:49 -04:00
Tom Lane	4ea9948e58	Fix up parallel-safety marking for appendrels. The previous coding assumed that the value derived by set_rel_consider_parallel() for an appendrel parent would be accurate for all the appendrel's children; but this is not so, for example because one child might scan a temp table. Instead, apply set_rel_consider_parallel() to each child rel as well as the parent, and then take the AND of the results as controlling parallel safety for the appendrel as a whole. (We might someday be able to deal more intelligently than this with cases in which some of the childrels are parallel-safe and others not, but that's for later.) Robert Haas and Tom Lane	2016-07-03 17:57:28 -04:00
Tom Lane	2c6e6471af	Allow treating TABLESAMPLE scans as parallel-safe. This was the intention all along, but an extraneous "return;" in set_rel_consider_parallel() caused sampled rels to never be marked consider_parallel. Since we don't have any partial tablesample path/plan type yet, there's no possibility of parallelizing the sample scan itself; but this fix allows such a scan to appear below a parallel join, for example.	2016-07-03 16:55:27 -04:00
Tom Lane	0e495c5e2f	Set correct cost data in Gather node added by force_parallel_mode. We were just leaving the cost fields zeroes, which produces obviously bogus output with force_parallel_mode = on. With force_parallel_mode = regress, the zeroes are hidden, but I wonder if they wouldn't still confuse add-on code such as auto_explain.	2016-07-03 15:35:29 -04:00
Tom Lane	c89d507649	Round rowcount estimate for a partial path to an integer. I'd been wondering why I was sometimes seeing fractional rowcount estimates in parallel-query situations, and this seems to be the reason. (You won't see the fractional parts in EXPLAIN, because it prints rowcounts with %.0f, but they are apparent in the debugger.) A fractional rowcount is not any saner for a partial path than any other kind of path, and it's equally likely to break cost estimation for higher paths, so apply clamp_row_est() like we do in other places.	2016-07-03 14:53:46 -04:00
Tom Lane	420c166163	Fix failure to mark all aggregates with appropriate transtype. In commit `915b703e1` I gave get_agg_clause_costs() the responsibility of marking Aggref nodes with the appropriate aggtranstype. I failed to notice that where it was being called from, it might see only a subset of the Aggref nodes that were in the original targetlist. Specifically, if there are duplicate aggregate calls in the tlist, either make_sort_input_target or make_window_input_target might put just a single instance into the grouping_target, and then only that instance would get marked. Fix by moving the call back into grouping_planner(), before we start building assorted PathTargets from the query tlist. Per report from Stefan Huehner. Report: <20160702131056.GD3165@huehner.biz>	2016-07-02 13:23:12 -04:00
Tom Lane	7b67a0a49c	Fix some interrelated planner issues with initPlans and Param munging. In commit `68fa28f77` I tried to teach SS_finalize_plan() to cope with initPlans attached anywhere in the plan tree, by dint of moving its handling of those into the recursion in finalize_plan(). It turns out that that doesn't really work: if a lower-level plan node emits an initPlan output parameter in its targetlist, it's legitimate for upper levels to reference those Params --- and at the point where this code runs, those references look just like the Param itself, so finalize_plan() quite properly rejects them as being in the wrong place. We could lobotomize the checks enough to allow that, probably, but then it's not clear that we'd have any meaningful check for misplaced Params at all. What seems better, at least in the near term, is to tweak standard_planner() a bit so that initPlans are never placed anywhere but the topmost plan node for a query level, restoring the behavior that occurred pre-9.6. Possibly we can do better if this code is ever merged into setrefs.c: then it would be possible to check a Param's placement only when we'd failed to replace it with a Var referencing a child plan node's targetlist. BTW, I'm now suspicious that finalize_plan is doing the wrong thing by returning the node's allParam rather than extParam to be incorporated in the parent node's set of used parameters. However, it makes no difference given that initPlans only appear at top level, so I'll leave that alone for now. Another thing that emerged from this is that standard_planner() needs to check for initPlans before deciding that it's safe to stick a Gather node on top in force_parallel_mode mode. We previously guarded against that by deciding the plan wasn't wholePlanParallelSafe if any subplans had been found, but after commit `5ce5e4a12` it's necessary to have this substitute test, because path parallel_safe markings don't account for initPlans. (Normally, we'd have decided the paths weren't safe anyway due to appearances of SubPlan nodes, Params, or CTE scans somewhere in the tree --- but it's possible for those all to be optimized away while initPlans still remain.) Per fuzz testing by Andreas Seltenreich. Report: <874m89rw7x.fsf@credativ.de>	2016-07-01 20:06:05 -04:00
Andres Freund	48bfeb244f	Improve WritebackContextInit() comment and prototype argument names. Author: Masahiko Sawada Discussion: CAD21AoBD=Of1OzL90Xx4Q-3j=-2q7=S87cs75HfutE=eCday2w@mail.gmail.com	2016-07-01 14:29:03 -07:00
Tom Lane	548af97fce	Provide and use a makefile target to build all generated headers. As of 9.6, pg_regress doesn't build unless storage/lwlocknames.h has been created; but there was nothing forcing that to happen if you just went into src/test/regress/ and built there. We previously had a similar complaint about plpython. To fix in a way that won't break next time we invent a generated header, make src/backend/Makefile expose a phony target for updating all the include files it builds, and invoke that before building pg_regress or plpython. In principle, maybe we ought to invoke that everywhere; but it would add a lot of usually-useless make cycles, so let's just do it in the places where people have complained. I made a couple of cosmetic adjustments in src/backend/Makefile as well, to deal with the generated headers in consistent orders. Michael Paquier and Tom Lane Report: <31398.1467036827@sss.pgh.pa.us> Report: <20150916200959.GB32090@msg.df7cb.de>	2016-07-01 15:09:02 -04:00
Alvaro Herrera	1bdae16fca	walreceiver: tweak pg_stat_wal_receiver behavior There are two problems in the original coding: one is that if one walreceiver process exits, the ready_to_display flag remains set in shared memory, exposing the conninfo of the next walreceiver before obfuscating. Fix by having WalRcvDie reset the flag. Second, the sleep-and-retry behavior that waited until walreceiver had set ready_to_display wasn't liked; the preference is to have it return no data instead, so let's do that. Bugs in `9ed551e0a` reported by Fujii Masao and Michël Paquier. Author: Michaël Paquier	2016-07-01 13:53:46 -04:00
Tom Lane	9e703987a8	Rethink the GetForeignUpperPaths API (again). In the previous design, the GetForeignUpperPaths FDW callback hook was called before we got around to labeling upper relations with the proper consider_parallel flag; this meant that any upper paths created by an FDW would be marked not-parallel-safe. While that's probably just as well right now, we aren't going to want it to be true forever. Hence, abandon the idea that FDWs should be allowed to inject upper paths before the core code has gotten around to creating the relevant upper relation. (Well, actually they still can, but it's on their own heads how well it works.) Instead, adopt the same API already designed for create_upper_paths_hook: we call GetForeignUpperPaths after each upperrel has been created and populated with the paths the core planner knows how to make.	2016-07-01 13:12:34 -04:00
Robert Haas	5ce5e4a12e	Set consider_parallel correctly for upper planner rels. Commit `3fc6e2d7f5` introduced new "upper" RelOptInfo structures but didn't set consider_parallel for them correctly, a point I completely missed when reviewing it. Later, commit `e06a38965b` made the situation worse by doing it incorrectly for the grouping relation. Try to straighten all of that out. Along the way, get rid of the annoying wholePlanParallelSafe flag, which was only necessarily because of the fact that upper planning stages didn't use paths at the time that code was written. The most important immediate impact of these changes is that force_parallel_mode will provide useful test coverage in quite a few more scenarios than it did previously, but it's also necessary preparation for fixing some problems related to subqueries. Patch by me, reviewed by Tom Lane.	2016-07-01 11:52:56 -04:00
Tom Lane	0daeba0e92	Be more paranoid in ruleutils.c's get_variable(). We were merely Assert'ing that the Var matched the RTE it's supposedly from. But if the user passes incorrect information to pg_get_expr(), the RTE might in fact not match; this led either to Assert failures or core dumps, as reported by Chris Hanks in bug #14220. To fix, just convert the Asserts to test-and-elog. Adjust an existing test-and-elog elsewhere in the same function to be consistent in wording. (If we really felt these were user-facing errors, we might promote them to ereport's; but I can't convince myself that they're worth translating.) Back-patch to 9.3; the problematic code doesn't exist before that, and a quick check says that 9.2 doesn't crash on such cases. Michael Paquier and Thomas Munro Report: <20160629224349.1407.32667@wrigleys.postgresql.org>	2016-07-01 11:40:33 -04:00
Robert Haas	4f9f495889	Fix crash bug in RestoreSnapshot. If serialized_snapshot->subxcnt > 0 and serialized_snapshot->xcnt == 0, the old coding would do the wrong thing and crash. This can happen on standby servers. Report by Andreas Seltenreich. Patch by Thomas Munro, reviewed by Amit Kapila and tested by Andreas Seltenreich.	2016-07-01 09:04:11 -04:00
Robert Haas	10c0558ffe	Fix several mistakes around parallel workers and client_encoding. Previously, workers sent data to the leader using the client encoding. That mostly worked, but the leader the converted the data back to the server encoding. Since not all encoding conversions are reversible, that could provoke failures. Fix by using the database encoding for all communication between worker and leader. Also, while temporary changes to GUC settings, as from the SET clause of a function, are in general OK for parallel query, changing client_encoding this way inside of a parallel worker is not OK. Previously, that would have confused the leader; with these changes, it would not confuse the leader, but it wouldn't do anything either. So refuse such changes in parallel workers. Also, the previous code naively assumed that when it received a NotifyResonse from the worker, it could pass that directly back to the user. But now that worker-to-leader communication always uses the database encoding, that's clearly no longer correct - though, actually, the old way was always broken for V2 clients. So disassemble and reconstitute the message instead. Issues reported by Peter Eisentraut. Patch by me, reviewed by Peter Eisentraut.	2016-06-30 18:35:32 -04:00
Tom Lane	f8c58554db	Fix typo in ReorderBufferIterTXNInit(). This looks like it would cause changes from subtransactions to be missed by the iterator being constructed, if those changes had been spilled to disk previously. This implies that large subtransactions might be lost (in whole or in part) by logical replication. Found and fixed by Petru-Florin Mihancea, per bug #14208. Report: <20160622144830.5791.22512@wrigleys.postgresql.org>	2016-06-30 12:37:02 -04:00
Tom Lane	3154e16737	Dodge compiler bug in Visual Studio 2013. VS2013 apparently has a problem with taking the address of a formal parameter in some cases. We do that elsewhere without trouble, but in this case the address is being passed to a subroutine that will probably get inlined, so maybe the combination of those things is what tickles the bug. Anyway, introducing an extra copy of the parameter value is enough to work around it. Per trouble report from Umair Shahid. Report: <CAM184AcjqKYZSdQqBHDrnENXHhW=mXbUC46QYPJ=nAh0gUHCGA@mail.gmail.com>	2016-06-29 19:07:19 -04:00
Tom Lane	8ebb69f854	Fix some infelicities in EXPLAIN output for parallel query plans. In non-text output formats, parallelized aggregates were reporting "Partial" or "Finalize" as a field named "Operation", which might be all right in the absence of any context --- but other plan node types use that field to report SQL-visible semantics, such as Select/Insert/Update/Delete. So that naming choice didn't seem good to me. I changed it to "Partial Mode". Also, the field did not appear at all for a non-parallelized Agg plan node, which is contrary to expectation in non-text formats. We're notionally producing objects that conform to a schema, so the set of fields for a given node type and EXPLAIN mode should be well-defined. I set it up to fill in "Simple" in such cases. Other fields that were added for parallel query, namely "Parallel Aware" and Gather's "Single Copy", had not gotten the word on that point either. Make them appear always in non-text output. Also, the latter two fields were nominally producing boolean output, but were getting it wrong, because bool values shouldn't be quoted in JSON or YAML. Somehow we'd not needed an ExplainPropertyBool formatting subroutine before 9.6; but now we do, so invent it. Discussion: <16002.1466972724@sss.pgh.pa.us>	2016-06-29 18:51:20 -04:00
Alvaro Herrera	9ed551e0a4	Add conninfo to pg_stat_wal_receiver Commit `b1a9bad9e7` introduced a stats view to provide insight into the running WAL receiver, but neglected to include the connection string in it, as reported by Michaël Paquier. This commit fixes that omission. (Any security-sensitive information is not disclosed). While at it, close the mild security hole that we were exposing the password in the connection string in shared memory. This isn't user-accessible, but it still looks like a good idea to avoid having the cleartext password in memory. Author: Michaël Paquier, Álvaro Herrera Review by: Vik Fearing Discussion: https://www.postgresql.org/message-id/CAB7nPqStg4M561obo7ryZ5G+fUydG4v1Ajs1xZT1ujtu+woRag@mail.gmail.com	2016-06-29 16:57:17 -04:00
Tom Lane	b32e63506c	Fix match_foreign_keys_to_quals for FKs linking to unused rtable entries. Since get_relation_foreign_keys doesn't try to determine whether RTEs are actually part of the query semantics, it might make FK info records linking to RTEs that won't have a RelOptInfo at all. Cope with that. Per bug #14219 from Andrew Gierth. Report: <20160629183338.1397.43514@wrigleys.postgresql.org>	2016-06-29 16:02:08 -04:00
Robert Haas	8dee039fa1	Fix obsolete comment. Commit `3bd261ca18` should have updated this, but didn't. Extracted from a larger patch by Piotr Stefaniak.	2016-06-29 13:12:50 -04:00
Alvaro Herrera	b78364df18	Remove unused arguments in two GiST subroutines These arguments became unused in commit `2c03216d83`. Noticed while skimming code for unrelated development. This is cosmetic, so no backpatch.	2016-06-28 16:01:13 -04:00
Tom Lane	c12f02ffc9	Don't apply sortgroupref labels to a tlist that might not match. If we need to use a gating Result node for pseudoconstant quals, create_scan_plan() intentionally suppresses use_physical_tlist's checks on whether there are matches for sortgroupref labels, on the grounds that we don't need matches because we can label the Result's projection output properly. However, it then called apply_pathtarget_labeling_to_tlist anyway. This oversight was harmless when written, but in commit `aeb9ae645` I made that function throw an error if there was no match. Thus, the combination of a table scan, pseudoconstant quals, and a non-simple-Var sortgroupref column threw the dreaded "ORDER/GROUP BY expression not found in targetlist" error. To fix, just skip applying the labeling in this case. Per report from Rushabh Lathia. Report: <CAGPqQf2iLB8t6t-XrL-zR233DFTXxEsfVZ4WSqaYfLupEoDxXA@mail.gmail.com>	2016-06-28 10:43:11 -04:00
Tom Lane	874fe3aea1	Fix CREATE MATVIEW/CREATE TABLE AS ... WITH NO DATA to not plan the query. Previously, these commands always planned the given query and went through executor startup before deciding not to actually run the query if WITH NO DATA is specified. This behavior is problematic for pg_dump because it may cause errors to be raised that we would rather not see before a REFRESH MATERIALIZED VIEW command is issued. See for example bug #13907 from Marian Krucina. This change is not sufficient to fix that particular bug, because we also need to tweak pg_dump to issue the REFRESH later, but it's a necessary step on the way. A user-visible side effect of doing things this way is that the returned command tag for WITH NO DATA cases will now be "CREATE MATERIALIZED VIEW" or "CREATE TABLE AS", not "SELECT 0". We could preserve the old behavior but it would take more code, and arguably that was just an implementation artifact not intended behavior anyhow. In 9.5 and HEAD, also get rid of the static variable CreateAsReladdr, which was trouble waiting to happen; there is not any prohibition on nested CREATE commands. Back-patch to 9.3 where CREATE MATERIALIZED VIEW was introduced. Michael Paquier and Tom Lane Report: <20160202161407.2778.24659@wrigleys.postgresql.org>	2016-06-27 15:57:50 -04:00
Teodor Sigaev	6734a1cacd	Change predecence of phrase operator. <-> operator now have higher predecence than & (AND) operator. This change was motivated by unexpected difference of similar queries: 'a & b <-> c'::tsquery and 'b <-> c & a'. Before first query means (a & b) <-> c and second one - '(b <-> c) & a', now phrase operator evaluates first. Per suggestion from Tom Lane 32260.1465402409@sss.pgh.pa.us	2016-06-27 20:55:24 +03:00
Teodor Sigaev	3dbbd0f02a	Do not fallback to AND for FTS phrase operator. If there is no positional information of lexemes then phrase operator will not fallback to AND operator. This change makes needing to modify TS_execute() interface, because somewhere (in indexes, for example) positional information is unaccesible and in this cases we need to force fallback to AND. Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com	2016-06-27 20:47:32 +03:00
Teodor Sigaev	028350f619	Make exact distance match for FTS phrase operator Phrase operator now requires exact distance betweens lexems instead of less-or-equal. Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com	2016-06-27 20:41:00 +03:00
Tom Lane	f1993038a4	Avoid making a separate pass over the query to check for partializability. It's rather silly to make a separate pass over the tlist + HAVING qual, and a separate set of visits to the syscache, when get_agg_clause_costs already has all the required information in hand. This nets out as less code as well as fewer cycles.	2016-06-26 15:55:01 -04:00
Tom Lane	19e972d558	Rethink node-level representation of partial-aggregation modes. The original coding had three separate booleans representing partial aggregation behavior, which was confusing, unreadable, and error-prone, not least because the booleans weren't always listed in the same order. It was also inadequate for the allegedly-desirable future extension to support intermediate partial aggregation, because we'd need separate markers for serialization and deserialization in such a case. Merge these bools into an enum "AggSplit" to provide symbolic names for the supported operating modes (and document what those are). By assigning the values of the enum constants carefully, we can treat AggSplit values as options bitmasks so that tests of what to do aren't noticeably more expensive than before. While at it, get rid of Aggref.aggoutputtype. That's not needed since commit `59a3795c2` got rid of setrefs.c's special-purpose Aggref comparison code, and it likewise seemed more confusing than helpful. Assorted comment cleanup as well (there's still more that I want to do in that line). catversion bump for change in Aggref node contents. Should be the last one for partial-aggregation changes. Discussion: <29309.1466699160@sss.pgh.pa.us>	2016-06-26 14:33:38 -04:00
Tom Lane	59a3795c25	Simplify planner's final setup of Aggrefs for partial aggregation. Commit e06a38965's original coding for constructing the execution-time expression tree for a combining aggregate was rather messy, involving duplicating quite a lot of code in setrefs.c so that it could inject a nonstandard matching rule for Aggrefs. Get rid of that in favor of explicitly constructing a combining Aggref with a partial Aggref as input, then allowing setref's normal matching logic to match the partial Aggref to the output of the lower plan node and hence replace it with a Var. In passing, rename and redocument make_partialgroup_input_target to have some connection to what it actually does.	2016-06-26 12:08:12 -04:00
Alvaro Herrera	e3ad3ffa68	Fix handling of multixacts predating pg_upgrade After pg_upgrade, it is possible that some tuples' Xmax have multixacts corresponding to the old installation; such multixacts cannot have running members anymore. In many code sites we already know not to read them and clobber them silently, but at least when VACUUM tries to freeze a multixact or determine whether one needs freezing, there's an attempt to resolve it to its member transactions by calling GetMultiXactIdMembers, and if the multixact value is "in the future" with regards to the current valid multixact range, an error like this is raised: ERROR: MultiXactId 123 has not been created yet -- apparent wraparound and vacuuming fails. Per discussion with Andrew Gierth, it is completely bogus to try to resolve multixacts coming from before a pg_upgrade, regardless of where they stand with regards to the current valid multixact range. It's possible to get from under this problem by doing SELECT FOR UPDATE of the problem tuples, but if tables are large, this is slow and tedious, so a more thorough solution is desirable. To fix, we realize that multixacts in xmax created in 9.2 and previous have a specific bit pattern that is never used in 9.3 and later (we already knew this, per comments and infomask tests sprinkled in various places, but we weren't leveraging this knowledge appropriately). Whenever the infomask of the tuple matches that bit pattern, we just ignore the multixact completely as if Xmax wasn't set; or, in the case of tuple freezing, we act as if an unwanted value is set and clobber it without decoding. This guarantees that no errors will be raised, and that the values will be progressively removed until all tables are clean. Most callers of GetMultiXactIdMembers are patched to recognize directly that the value is a removable "empty" multixact and avoid calling GetMultiXactIdMembers altogether. To avoid changing the signature of GetMultiXactIdMembers() in back branches, we keep the "allow_old" boolean flag but rename it to "from_pgupgrade"; if the flag is true, we always return an empty set instead of looking up the multixact. (I suppose we could remove the argument in the master branch, but I chose not to do so in this commit). This was broken all along, but the error-facing message appeared first because of commit `8e9a16ab8f` and was partially fixed in `a25c2b7c4d`. This fix, backpatched all the way back to 9.3, goes approximately in the same direction as `a25c2b7c4d` but should cover all cases. Bug analysis by Andrew Gierth and Álvaro Herrera. A number of public reports match this bug: https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net https://www.postgresql.org/message-id/538F3D70.6080902@publicrelay.com https://www.postgresql.org/message-id/556439CF.7070109@pscs.co.uk https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com https://www.postgresql.org/message-id/20160615203829.5798.4594@wrigleys.postgresql.org	2016-06-24 18:29:28 -04:00
Tom Lane	8cf739de85	Fix building of large (bigger than shared_buffers) hash indexes. When the index is predicted to need more than NBuffers buckets, CREATE INDEX attempts to sort the index entries by hash key before insertion, so as to reduce thrashing. This code path got broken by commit `9f03ca9151`, which overlooked that _hash_form_tuple() is not just an alias for index_form_tuple(). The index got built anyway, but with garbage data, so that searches for pre-existing tuples always failed. Fix by refactoring to separate construction of the indexable data from calling index_form_tuple(). Per bug #14210 from Daniel Newman. Back-patch to 9.5 where the bug was introduced. Report: <20160623162507.17237.39471@wrigleys.postgresql.org>	2016-06-24 16:57:36 -04:00
Tom Lane	bd1693e89e	Fix small memory leak in partial-aggregate deserialization functions. A deserialize function's result is short-lived data during partial aggregation, since we're just going to pass it to the combine function and then it's of no use anymore. However, the built-in deserialize functions allocated their results in the aggregate state context, resulting in a query-lifespan memory leak. It's probably not possible for this to amount to anything much at present, since the number of leaked results would only be the number of worker processes. But it might become a problem in future. To fix, don't use the same convenience subroutine for setting up results that the aggregate transition functions use. David Rowley Report: <10050.1466637736@sss.pgh.pa.us>	2016-06-23 10:55:59 -04:00
Tom Lane	f8ace5477e	Fix type-safety problem with parallel aggregate serial/deserialization. The original specification for this called for the deserialization function to have signature "deserialize(serialtype) returns transtype", which is a security violation if transtype is INTERNAL (which it always would be in practice) and serialtype is not (which ditto). The patch blithely overrode the opr_sanity check for that, which was sloppy-enough work in itself, but the indisputable reason this cannot be allowed to stand is that CREATE FUNCTION will reject such a signature and thus it'd be impossible for extensions to create parallelizable aggregates. The minimum fix to make the signature type-safe is to add a second, dummy argument of type INTERNAL. But to lock it down a bit more and make misuse of INTERNAL-accepting functions less likely, let's get rid of the ability to specify a "serialtype" for an aggregate and just say that the only useful serialtype is BYTEA --- which, in practice, is the only interesting value anyway, due to the usefulness of the send/recv infrastructure for this purpose. That means we only have to allow "serialize(internal) returns bytea" and "deserialize(bytea, internal) returns internal" as the signatures for these support functions. In passing fix bogus signature of int4_avg_combine, which I found thanks to adding an opr_sanity check on combinefunc signatures. catversion bump due to removing pg_aggregate.aggserialtype and adjusting signatures of assorted built-in functions. David Rowley and Tom Lane Discussion: <27247.1466185504@sss.pgh.pa.us>	2016-06-22 16:52:41 -04:00
Tom Lane	e45e990e4b	Make "postgres -C guc" print "" not "(null)" for null-valued GUCs. Commit `0b0baf262` et al made this case print "(null)" on the grounds that that's what happened on platforms that didn't crash. But neither behavior was actually intentional. What we should print is just an empty string, for compatibility with the behavior of SHOW and other ways of examining string GUCs. Those code paths don't distinguish NULL from empty strings, so we should not here either. Per gripe from Alain Radix. Like the previous patch, back-patch to 9.2 where -C option was introduced. Discussion: <CA+YdpwxPUADrmxSD7+Td=uOshMB1KkDN7G7cf+FGmNjjxMhjbw@mail.gmail.com>	2016-06-22 11:55:18 -04:00
Bruce Momjian	3e9df858f0	Update comment about allowing GUCs to change scanning. Reported-by: David G. Johnston Discussion: CAKFQuwZZvnxwSq9tNtvL+uyuDKGgV91zR_agtPxQHRWMWQRP8g@mail.gmail.com	2016-06-21 20:23:31 -04:00
Tom Lane	8b9d323cb9	Refactor planning of projection steps that don't need a Result plan node. The original upper-planner-pathification design (commit `3fc6e2d7f5`) assumed that we could always determine during Path formation whether or not we would need a Result plan node to perform projection of a targetlist. That turns out not to work very well, though, because createplan.c still has some responsibilities for choosing the specific target list associated with sorting/grouping nodes (in particular it might choose to add resjunk columns for sorting). We might not ever refactor that --- doing so would push more work into Path formation, which isn't attractive --- and we certainly won't do so for 9.6. So, while create_projection_path and apply_projection_to_path can tell for sure what will happen if the subpath is projection-capable, they can't tell for sure when it isn't. This is at least a latent bug in apply_projection_to_path, which might think it can apply a target to a non-projecting node when the node will end up computing something different. Also, I'd tied the creation of a ProjectionPath node to whether or not a Result is needed, but it turns out that we sometimes need a ProjectionPath node anyway to avoid modifying a possibly-shared subpath node. Callers had to use create_projection_path for such cases, and we added code to them that knew about the potential omission of a Result node and attempted to adjust the cost estimates for that. That was uncertainly correct and definitely ugly/unmaintainable. To fix, have create_projection_path explicitly check whether a Result is needed and adjust its cost estimate accordingly, though it creates a ProjectionPath in either case. apply_projection_to_path is now mostly just an optimized version that can avoid creating an extra Path node when the input is known to not be shared with any other live path. (There is one case that create_projection_path doesn't handle, which is pushing parallel-safe expressions below a Gather node. We could make it do that by duplicating the GatherPath, but there seems no need as yet.) create_projection_plan still has to recheck the tlist-match condition, which means that if the matching situation does get changed by createplan.c then we'll have made a slightly incorrect cost estimate. But there seems no help for that in the near term, and I doubt it occurs often enough, let alone would change planning decisions often enough, to be worth stressing about. I added a "dummypp" field to ProjectionPath to track whether create_projection_path thinks a Result is needed. This is not really necessary as-committed because create_projection_plan doesn't look at the flag; but it seems like a good idea to remember what we thought when forming the cost estimate, if only for debugging purposes. In passing, get rid of the target_parallel parameter added to apply_projection_to_path by commit `54f5c5150`. I don't think that's a good idea because it involves callers in what should be an internal decision, and opens us up to missing optimization opportunities if callers think they don't need to provide a valid flag, as most don't. For the moment, this just costs us an extra has_parallel_hazard call when planning a Gather. If that starts to look expensive, I think a better solution would be to teach PathTarget to carry/cache knowledge of parallel-safety of its contents.	2016-06-21 18:38:20 -04:00
Peter Eisentraut	47981a4665	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 0c374f8d25ed31833a10d24252bc928d41438838	2016-06-20 09:48:08 -04:00
Tom Lane	9bc3332372	Improve error message annotation for GRANT/REVOKE on untrusted PLs. The annotation for "ERROR: language "foo" is not trusted" used to say "HINT: Only superusers can use untrusted languages", which was fairly poorly thought out. For one thing, it's not a hint about what to do, but a statement of fact, which makes it errdetail. But also, this fails to clarify things much, because there's a missing step in the chain of reasoning. I think it's more useful to say "GRANT and REVOKE are not allowed on untrusted languages, because only superusers can use untrusted languages". It's been like this for a long time, but given the lack of previous complaints, I don't think this is worth back-patching. Discussion: <1417.1466289901@sss.pgh.pa.us>	2016-06-18 19:38:59 -04:00
Tom Lane	100340e2dc	Restore foreign-key-aware estimation of join relation sizes. This patch provides a new implementation of the logic added by commit `137805f89` and later removed by `77ba61080`. It differs from the original primarily in expending much less effort per joinrel in large queries, which it accomplishes by doing most of the matching work once per query not once per joinrel. Hopefully, it's also less buggy and better commented. The never-documented enable_fkey_estimates GUC remains gone. There remains work to be done to make the selectivity estimates account for nulls in FK referencing columns; but that was true of the original patch as well. We may be able to address this point later in beta. In the meantime, any error should be in the direction of overestimating rather than underestimating joinrel sizes, which seems like the direction we want to err in. Tomas Vondra and Tom Lane Discussion: <31041.1465069446@sss.pgh.pa.us>	2016-06-18 15:22:34 -04:00
Tom Lane	598aa194af	Still another try at fixing scanjoin_target insertion into parallel plans. The previous code neglected the fact that the scanjoin_target might carry sortgroupref labelings that we need to absorb. Instead, do create_projection_path() unconditionally, and tweak the path's cost estimate after the fact. (I'm now convinced that we ought to refactor the way we account for sometimes not needing a separate projection step, but right now is not the time for that sort of cleanup.) Problem identified by Amit Kapila, patch by me.	2016-06-18 00:28:51 -04:00
Tom Lane	915b703e16	Fix handling of argument and result datatypes for partial aggregation. When doing partial aggregation, the args list of the upper (combining) Aggref node is replaced by a Var representing the output of the partial aggregation steps, which has either the aggregate's transition data type or a serialized representation of that. However, nodeAgg.c blindly continued to use the args list as an indication of the user-level argument types. This broke resolution of polymorphic transition datatypes at executor startup (though it accidentally failed to fail for the ANYARRAY case, which is likely the only one anyone had tested). Moreover, the constructed FuncExpr passed to the finalfunc contained completely wrong information, which would have led to bogus answers or crashes for any case where the finalfunc examined that information (which is only likely to be with polymorphic aggregates using a non-polymorphic transition type). As an independent bug, apply_partialaggref_adjustment neglected to resolve a polymorphic transition datatype before assigning it as the output type of the lower-level Aggref node. This again accidentally failed to fail for ANYARRAY but would be unlikely to work in other cases. To fix the first problem, record the user-level argument types in a separate OID-list field of Aggref, and look to that rather than the args list when asking what the argument types were. (It turns out to be convenient to include any "direct" arguments in this list too, although those are not currently subject to being overwritten.) Rather than adding yet another resolve_aggregate_transtype() call to fix the second problem, add an aggtranstype field to Aggref, and store the resolved transition type OID there when the planner first computes it. (By doing this in the planner and not the parser, we can allow the aggregate's transition type to change from time to time, although no DDL support yet exists for that.) This saves nothing of consequence for simple non-polymorphic aggregates, but for polymorphic transition types we save a catalog lookup during executor startup as well as several planner lookups that are new in 9.6 due to parallel query planning. In passing, fix an error that was introduced into count_agg_clauses_walker some time ago: it was applying exprTypmod() to something that wasn't an expression node at all, but a TargetEntry. exprTypmod silently returned -1 so that there was not an obvious failure, but this broke the intended sensitivity of aggregate space consumption estimates to the typmod of varchar and similar data types. This part needs to be back-patched. Catversion bump due to change of stored Aggref nodes. Discussion: <8229.1466109074@sss.pgh.pa.us>	2016-06-17 21:44:37 -04:00
Alvaro Herrera	1ca4a1b5d2	Finish up XLOG_HINT renaming Commit `b8fd1a09f3` renamed XLOG_HINT to XLOG_FPI, but neglected two places. Backpatch to 9.3, like that commit.	2016-06-17 18:05:55 -04:00
Robert Haas	71d05a2c7b	pg_visibility: Add pg_truncate_visibility_map function. This requires some core changes as well so that we can properly WAL-log the truncation. Specifically, it changes the format of the XLOG_SMGR_TRUNCATE WAL record, so bump XLOG_PAGE_MAGIC. Patch by me, reviewed but not fully endorsed by Andres Freund.	2016-06-17 17:37:30 -04:00
Robert Haas	54f5c5150f	Try again to fix the way the scanjoin_target is used with partial paths. Commit `04ae11f62e` removed some broken code to apply the scan/join target to partial paths, but its theory that this processing step is totally unnecessary turns out to be wrong. Put similar code back again, but this time, check for parallel-safety and avoid in-place modifications to paths that may already have been used as part of some other path. (This is not an entirely elegant solution to this problem; it might be better, for example, to postpone generate_gather_paths for the topmost scan/join rel until after the scan/join target has been applied. But this is not the time for such redesign work.) Amit Kapila and Robert Haas	2016-06-17 16:29:07 -04:00
Robert Haas	ede62e56fb	Add VACUUM (DISABLE_PAGE_SKIPPING) for emergencies. If you really want to vacuum every single page in the relation, regardless of apparent visibility status or anything else, you can use this option. In previous releases, this behavior could be achieved using VACUUM (FREEZE), but because we can now recognize all-frozen pages as not needing to be frozen again, that no longer works. There should be no need for routine use of this option, but maybe bugs or disaster recovery will necessitate its use. Patch by me, reviewed by Andres Freund.	2016-06-17 15:48:57 -04:00
Robert Haas	9c188a8454	Fix typo. Thomas Munro	2016-06-17 12:55:30 -04:00
Robert Haas	292794f82b	Remove PID from 'parallel worker' context message. Discussion: <bfd204ab-ab1a-792a-b345-0274a09a4b5f@2ndquadrant.com>	2016-06-17 09:26:17 -04:00
Tom Lane	4c56f3269a	Fix validation of overly-long IPv6 addresses. The inet/cidr types sometimes failed to reject IPv6 inputs with too many colon-separated fields, instead translating them to '::/0'. This is the result of a thinko in the original ISC code that seems to be as yet unreported elsewhere. Per bug #14198 from Stefan Kaltenbrunner. Report: <20160616182222.5798.959@wrigleys.postgresql.org>	2016-06-16 17:16:32 -04:00
Tom Lane	bfb937427b	Fix fuzzy thinking in ReinitializeParallelDSM(). The fact that no workers were successfully launched in the previous iteration does not excuse us from setting up properly to try again. This appears to explain crashes I saw in parallel regression testing due to error_mqh being NULL when it shouldn't be. Minor other cosmetic fixes too.	2016-06-16 15:20:29 -04:00
Tom Lane	75be66464c	Invent min_parallel_relation_size GUC to replace a hard-wired constant. The main point of doing this is to allow the cutoff to be set very small, even zero, to allow parallel-query behavior to be tested on relatively small tables such as we typically use in the regression tests. But it might be of use to users too. The number-of-workers scaling behavior in create_plain_partial_paths() is pretty ad-hoc and subject to change, so we won't expose anything about that, but the notion of not considering parallel query at all for tables below size X seems reasonably stable. Amit Kapila, per a suggestion from me Discussion: <17170.1465830165@sss.pgh.pa.us>	2016-06-16 13:47:20 -04:00
Tom Lane	0b0baf2621	Avoid crash in "postgres -C guc" for a GUC with a null string value. Emit "(null)" instead, which was the behavior all along on platforms that don't crash, eg OS X. Per report from Jehan-Guillaume de Rorthais. Back-patch to 9.2 where -C option was introduced. Michael Paquier Report: <20160615204036.2d35d86a@firost>	2016-06-16 12:17:38 -04:00
Robert Haas	38e9f90a22	Fix lazy_scan_heap so that it won't mark pages all-frozen too soon. Commit `a892234f83` added a new bit per page to the visibility map fork indicating whether the page is all-frozen, but incorrectly assumed that if lazy_scan_heap chose to freeze a tuple then that tuple would not need to later be frozen again. This turns out to be false, because xmin and xmax (and conceivably xvac, if dealing with tuples from very old releases) could be frozen at separate times. Thanks to Andres Freund for help in uncovering and tracking down this issue.	2016-06-15 14:30:06 -04:00
Tom Lane	8383486f10	Force idle_in_transaction_session_timeout off in pg_dump and autovacuum. We disable statement_timeout and lock_timeout during dump and restore, to prevent any global settings that might exist from breaking routine backups. Commit `c6dda1f48` should have added idle_in_transaction_session_timeout to that list, but failed to. Another place where these timeouts get turned off is autovacuum. While I doubt an idle timeout could fire there, it seems better to be safe than sorry. pg_dump issue noted by Bernd Helmle, the other one found by grepping. Report: <352F9B77DB5D3082578D17BB@eje.land.credativ.lan>	2016-06-15 10:53:03 -04:00
Tom Lane	783cb6e48b	Fix multiple minor infelicities in aclchk.c error reports. pg_type_aclmask reported the wrong type's OID when complaining that it could not find a type's typelem. It also failed to provide a suitable errcode when the initially given OID doesn't exist (which is a user-facing error, since that OID can be user-specified). pg_foreign_data_wrapper_aclmask and pg_foreign_server_aclmask likewise lacked errcode specifications. Trivial cosmetic adjustments too. The wrong-type-OID problem was reported by Petru-Florin Mihancea in bug #14186; the other issues noted by me while reading the code. These errors all seem to be aboriginal in the respective routines, so back-patch as necessary. Report: <20160613163159.5798.52928@wrigleys.postgresql.org>	2016-06-13 13:53:10 -04:00
Tom Lane	89d53515e5	In planner.c, avoid assuming that all PathTargets have sortgrouprefs. The struct definition for PathTarget specifies that a NULL sortgrouprefs pointer means no sortgroupref labels. While it's likely that there should always be at least one labeled column in the places that were unconditionally fetching through the pointer, it seems wiser to adhere to the data structure specification and test first. Add a macro to make this convenient. Per experimentation with running the regression tests with a very small parallelization threshold --- the crash I observed may well represent a bug elsewhere, but still this coding was not very robust. Report: <20756.1465834072@sss.pgh.pa.us>	2016-06-13 12:59:25 -04:00
Noah Misch	3be0a62ffe	Finish pgindent run for 9.6: Perl files.	2016-06-12 04:19:56 -04:00
Andres Freund	4bc0f165cb	Change default of backend_flush_after GUC to 0 (disabled). While beneficial, both for throughput and average/worst case latency, in a significant number of workloads, there are other workloads in which backend_flush_after can cause significant performance regressions in comparison to < 9.6 releases. The regression is most likely when the hot data set is bigger than shared buffers, but significantly smaller than the operating system's page cache. I personally think that the benefit of enabling backend flush control is considerably bigger than the potential downsides, but a fair argument can be made that not regressing is more important than improving performance/latency. As the latter is the consensus, change the default to 0. The other settings introduced in `428b1d6b2` do not have the same potential for regressions, so leave them enabled. Benchmarks leading up to changing the default have been performed by Mithun Cy, Ashutosh Sharma and Robert Haas. Discussion: CAD__OuhPmc6XH=wYRm_+Q657yQE88DakN4=Ybh2oveFasHkoeA@mail.gmail.com	2016-06-10 15:31:11 -07:00
Tom Lane	3303ea1a32	Remove reltarget_has_non_vars flag. Commit `b12fd41c6` added a "reltarget_has_non_vars" field to RelOptInfo, but failed to maintain it accurately. Since its only purpose was to skip calls to has_parallel_hazard() in the simple case where a rel's targetlist is all Vars, and that call is really pretty cheap in that case anyway, it seems like this is just a case of premature optimization. Let's drop the flag and do the calls unconditionally until it's proven that we need more smarts here.	2016-06-10 16:20:03 -04:00
Tom Lane	2f153ddfdd	Refactor to reduce code duplication for function property checking. As noted by Andres Freund, we'd accumulated quite a few similar functions in clauses.c that examine all functions in an expression tree to see if they satisfy some boolean test. Reduce the duplication by inventing a function check_functions_in_node() that applies a simple callback function to each SQL function OID appearing in a given expression node. This also fixes some arguable oversights; for example, contain_mutable_functions() did not check aggregate or window functions for mutability. I doubt that that represents a live bug at the moment, because we don't really consider mutability for aggregates; but it might someday be one. I chose to put check_functions_in_node() in nodeFuncs.c because it seemed like other modules might wish to use it in future. That in turn forced moving set_opfuncid() et al into nodeFuncs.c, as the alternative was for nodeFuncs.c to depend on optimizer/setrefs.c which didn't seem very clean. In passing, teach contain_leaked_vars_walker() about a few more expression node types it can safely look through, and improve the rather messy and undercommented code in has_parallel_hazard_walker(). Discussion: <20160527185853.ziol2os2zskahl7v@alap3.anarazel.de>	2016-06-10 16:03:46 -04:00
Kevin Grittner	13761bccb1	Rename local variable for consistency. Pointed out by Robert Haas.	2016-06-10 11:24:01 -05:00
Kevin Grittner	bf9a60ee33	Fix interaction between CREATE INDEX and "snapshot too old". Since indexes are created without valid LSNs, an index created while a snapshot older than old_snapshot_threshold existed could cause queries to return incorrect results when those old snapshots were used, if any relevant rows had been subject to early pruning before the index was built. Prevent usage of a newly created index until all such snapshots are released, for relations where this can happen. Questions about the interaction of "snapshot too old" with index creation were initially raised by Andres Freund. Reviewed by Robert Haas.	2016-06-10 09:25:31 -05:00
Tom Lane	cae1c788b9	Improve the situation for parallel query versus temp relations. Transmit the leader's temp-namespace state to workers. This is important because without it, the workers do not really have the same search path as the leader. For example, there is no good reason (and no extant code either) to prevent a worker from executing a temp function that the leader created previously; but as things stood it would fail to find the temp function, and then either fail or execute the wrong function entirely. We still prohibit a worker from creating a temp namespace on its own. In effect, a worker can only see the session's temp namespace if the leader had created it before starting the worker, which seems like the right semantics. Also, transmit the leader's BackendId to workers, and arrange for workers to use that when determining the physical file path of a temp relation belonging to their session. While the original intent was to prevent such accesses entirely, there were a number of holes in that, notably in places like dbsize.c which assume they can safely access temp rels of other sessions anyway. We might as well get this right, as a small down payment on someday allowing workers to access the leader's temp tables. (With this change, directly using "MyBackendId" as a relation or buffer backend ID is deprecated; you should use BackendIdForTempRelations() instead. I left a couple of such uses alone though, as they're not going to be reachable in parallel workers until we do something about localbuf.c.) Move the thou-shalt-not-access-thy-leader's-temp-tables prohibition down into localbuf.c, which is where it actually matters, instead of having it in relation_open(). This amounts to recognizing that access to temp tables' catalog entries is perfectly safe in a worker, it's only the data in local buffers that is problematic. Having done all that, we can get rid of the test in has_parallel_hazard() that says that use of a temp table's rowtype is unsafe in parallel workers. That test was unduly expensive, and if we really did need such a prohibition, that was not even close to being a bulletproof guard for it. (For example, any user-defined function executed in a parallel worker might have attempted such access.)	2016-06-09 20:16:11 -04:00
Robert Haas	2410a2543e	Repair a bit of pgindent damage. Inserting line-breaks into the middle of a URL is, to put it mildly, not very helpful, so persuade pgindent to leave it alone.	2016-06-09 18:09:17 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Robert Haas	b12fd41c69	Don't generate parallel paths for rels with parallel-restricted outputs. Such paths are unsafe. To make it cheaper to detect when this case applies, track whether a relation's default PathTarget contains any non-Vars. In most cases, the answer will be no, which enables us to determine cheaply that the target list for a proposed path is parallel-safe. However, subquery pull-up can create cases that require us to inspect the target list more carefully. Amit Kapila, reviewed by me.	2016-06-09 12:43:36 -04:00
Tom Lane	e4158319f3	Mop-up for parallel degree-ectomy. Fix a couple of overlooked uses of "degree" terminology. Make the parallel worker count selection logic in create_plain_partial_paths more robust (in particular, it failed with max_parallel_workers_per_gather set to zero).	2016-06-09 11:16:26 -04:00
Robert Haas	c9ce4a1c61	Eliminate "parallel degree" terminology. This terminology provoked widespread complaints. So, instead, rename the GUC max_parallel_degree to max_parallel_workers_per_gather (leaving room for a possible future GUC max_parallel_workers that acts as a system-wide limit), and rename the parallel_degree reloption to parallel_workers. Rename structure members to match. These changes create a dump/restore hazard for users of PostgreSQL 9.6beta1 who have set the reloption (or applied the GUC using ALTER USER or ALTER DATABASE).	2016-06-09 10:00:26 -04:00
Alvaro Herrera	4f04b66f97	Fix loose ends for SQL ACCESS METHOD objects COMMENT ON ACCESS METHOD was missing; add it, along psql tab-completion support for it. psql was also missing a way to list existing access methods; the new \dA command does that. Also add tab-completion support for DROP ACCESS METHOD. Author: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAB7nPqTzdZdu8J7EF8SXr_R2U5bSUUYNOT3oAWBZdEoggnwhGA@mail.gmail.com	2016-06-07 17:59:34 -04:00
Tom Lane	77ba610805	Revert "Use Foreign Key relationships to infer multi-column join selectivity". This commit reverts `137805f89` as well as the associated commits `015e88942`, `5306df283`, and `68d704edb`. We found multiple bugs in this feature, and there was concern about possible planner slowdown (though to be fair, exhibiting a very large slowdown proved difficult). The way forward requires a considerable rewrite, which may or may not be possible to accomplish in time for beta2. In my judgment reviewing the rewrite will be easier to accomplish starting from a clean slate, so let's temporarily revert what's there now. This also leaves us in a safe state if it turns out to be necessary to postpone the rewrite to the next development cycle. Discussion: <20160429102531.GA13701@huehner.biz>	2016-06-07 17:21:17 -04:00
Peter Eisentraut	5c6d2a5e7c	Message style and wording fixes	2016-06-07 14:18:55 -04:00
Simon Riggs	1f74a90888	Correct phrasing in dsm.c comments	2016-06-07 17:34:33 +01:00
Stephen Frost	40fc457520	Minor typos / copy-editing for snapmgr.c Noticed while reviewing snapshot management.	2016-06-07 11:14:48 -04:00
Tom Lane	f64340e743	Don't reset changes_since_analyze after a selective-columns ANALYZE. If we ANALYZE only selected columns of a table, we should not postpone auto-analyze because of that; other columns may well still need stats updates. As committed, the counter is left alone if a column list is given, whether or not it includes all analyzable columns of the table. Per complaint from Tomasz Ostrowski. It's been like this a long time, so back-patch to all supported branches. Report: <ef99c1bd-ff60-5f32-2733-c7b504eb960c@ato.waw.pl>	2016-06-06 17:44:17 -04:00
Robert Haas	c6dbf1fe79	Stop the executor if no more tuples can be sent from worker to leader. If a Gather node has read as many tuples as it needs (for example, due to Limit) it may detach the queue connecting it to the worker before reading all of the worker's tuples. Rather than let the worker continue to generate and send all of the results, have it stop after sending the next tuple. More could be done here to stop the worker even quicker, but this is about as well as we can hope to do for 9.6. This is in response to a problem report from Andreas Seltenreich. Commit `44339b892a` should be actually be sufficient to fix that example even without this change, but it seems better to do this, too, since we might otherwise waste quite a large amount of effort in one or more workers. Discussion: CAA4eK1KOKGqmz9bGu+Z42qhRwMbm4R5rfnqsLCNqFs9j14jzEA@mail.gmail.com Amit Kapila	2016-06-06 14:52:58 -04:00
Robert Haas	44339b892a	shm_mq: After a send fails with SHM_MQ_DETACHED, later ones should too. Prior to this patch, it was occasionally possible, after shm_mq_sendv had previously returned SHM_MQ_DETACHED, for a later shm_mq_sendv operation to fail an assertion instead of just again returning SHM_MQ_ATTACHED. From the shm_mq code's point of view, it was expecting to be called again with the same arguments, since the previous operation had only partially completed. However, a caller who isn't using non-blocking mode won't be prepared to repeat the call with the same arguments, and this code shouldn't expect that they will. Repair in such a way that we'll be OK whether the next call uses the same arguments or not. Found by Andreas Seltenreich. Analysis and sketch of fix by Amit Kapila. Patch by me, reviewed by Amit Kapila.	2016-06-06 14:35:30 -04:00
Robert Haas	932b97a011	Fix typo. Jim Nasby	2016-06-06 07:58:50 -04:00
Peter Eisentraut	6201a8ef3a	Fix whitespace	2016-06-05 17:02:56 -04:00
Tom Lane	8a859691d5	Properly initialize SortSupport for ORDER BY rechecks in nodeIndexscan.c. Fix still another bug in commit 35fcb1b3d: it failed to fully initialize the SortSupport states it introduced to allow the executor to re-check ORDER BY expressions containing distance operators. That led to a null pointer dereference if the sortsupport code tried to use ssup_cxt. The problem only manifests in narrow cases, explaining the lack of previous field reports. It requires a GiST-indexable distance operator that lacks SortSupport and is on a pass-by-ref data type, which among core+contrib seems to be only btree_gist's interval opclass; and it requires the scan to be done as an IndexScan not an IndexOnlyScan, which explains how btree_gist's regression test didn't catch it. Per bug #14134 from Jihyun Yu. Peter Geoghegan Report: <20160511154904.2603.43889@wrigleys.postgresql.org>	2016-06-05 11:53:06 -04:00
Tom Lane	05104f6936	Fix grammar's AND/OR flattening to work with operator_precedence_warning. It'd be good for "(x AND y) AND z" to produce a three-child AND node whether or not operator_precedence_warning is on, but that failed to happen when it's on because makeAndExpr() didn't look through the added AEXPR_PAREN node. This has no effect on generated plans because prepqual.c would flatten the AND nest anyway; but it does affect the number of parens printed in ruleutils.c, for example. I'd already fixed some similar hazards in parse_expr.c in commit `abb164655`, but didn't think to search gram.y for problems of this ilk. Per gripe from Jean-Pierre Pelletier. Report: <fa0535ec6d6428cfec40c7e8a6d11156@mail.gmail.com>	2016-06-03 19:12:29 -04:00
Tom Lane	d50183c578	Inline the easy cases in MakeExpandedObjectReadOnly(). This attempts to buy back some of whatever performance we lost from fixing bug #14174 by inlining the initial checks in MakeExpandedObjectReadOnly() into the callers. We can do that in a macro without creating multiple- evaluation hazards, so it's pretty much free notationally; and the amount of code added to callers should be minimal as well. (Testing a value can't take many more instructions than passing it to a subroutine.) Might as well inline DatumIsReadWriteExpandedObject() while we're at it. This is an ABI break for callers, so it doesn't seem safe to put into 9.5, but I see no reason not to do it in HEAD.	2016-06-03 18:34:05 -04:00
Tom Lane	9eaf5be506	Mark read/write expanded values as read-only in ValuesNext(), too. Further thought about bug #14174 motivated me to try the case of a R/W datum being returned from a VALUES list, and sure enough it was broken. Fix that. Also add a regression test case exercising the same scenario for FunctionScan. That's not broken right now, because the function's result will get shoved into a tuplestore between generation and use; but it could easily become broken whenever we get around to optimizing FunctionScan better. There don't seem to be any other places where we put the result of expression evaluation into a virtual tuple slot that could then be the source for Vars of further expression evaluation, so I think this is the end of this bug.	2016-06-03 18:07:14 -04:00
Tom Lane	69f526aa49	Mark read/write expanded values as read-only in ExecProject(). If a plan node output expression returns an "expanded" datum, and that output column is referenced in more than one place in upper-level plan nodes, we need to ensure that what is returned is a read-only reference not a read/write reference. Otherwise one of the referencing sites could scribble on or even delete the expanded datum before we have evaluated the others. Commit `1dc5ebc907`, which introduced this feature, supposed that it'd be sufficient to make SubqueryScan nodes force their output columns to read-only state. The folly of that was revealed by bug #14174 from Andrew Gierth, and really should have been immediately obvious considering that the planner will happily optimize SubqueryScan nodes out of the plan without any regard for this issue. The safest fix seems to be to make ExecProject() force its results into read-only state; that will cover every case where a plan node returns expression results. Actually we can delegate this to ExecTargetList() since we can recursively assume that plain Vars will not reference read-write datums. That should keep the extra overhead down to something minimal. We no longer need ExecMakeSlotContentsReadOnly(), which was introduced only in support of the idea that just a few plan node types would need to do this. In the future it would be nice to have the planner account for this problem and inject force-to-read-only expression evaluation nodes into only the places where there's a risk of multiple evaluation. That's not a suitable solution for 9.5 or even 9.6 at this point, though. Report: <20160603124628.9932.41279@wrigleys.postgresql.org>	2016-06-03 15:14:50 -04:00
Robert Haas	04ae11f62e	Remove bogus code to apply PathTargets to partial paths. The partial paths that get modified may already have been used as part of a GatherPath which appears in the path list, so modifying them is not a good idea at this stage - especially because this code has no check that the PathTarget is in fact parallel-safe. When partial aggregation is being performed, this is actually harmless because we'll end up replacing the pathtargets here with the correct ones within create_grouping_paths(). But if we've got a query tree containing only scan/join operations then this can result in incorrectly pushing down parallel-restricted target list entries. If those are, for example, references to subqueries, that can crash the server; but it's wrong in any event. Amit Kapila	2016-06-03 14:27:33 -04:00
Kevin Grittner	370a46fc01	Add new snapshot fields to serialize/deserialize functions. The "snapshot too old" condition was not being recognized when using a copied snapshot, since the original timestamp and lsn were not being passed along. Noticed when testing the combination of "snapshot too old" with parallel query execution.	2016-06-03 11:13:28 -05:00
Robert Haas	6436a853f1	Fix comment to be more accurate. Now that we skip vacuuming all-frozen pages, this comment needs updating. Masahiko Sawada	2016-06-03 11:56:57 -04:00
Greg Stark	e1623c3959	Fix various common mispellings. Mostly these are just comments but there are a few in documentation and a handful in code and tests. Hopefully this doesn't cause too much unnecessary pain for backpatching. I relented from some of the most common like "thru" for that reason. The rest don't seem numerous enough to cause problems. Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings	2016-06-03 16:08:45 +01:00
Robert Haas	fdfaccfa79	Cosmetic improvements to freeze map code. Per post-commit review comments from Andres Freund, improve variable names, comments, and in one place, slightly improve the code structure. Masahiko Sawada	2016-06-03 08:43:41 -04:00
Greg Stark	a3b30763cc	Be conservative about alignment requirements of struct epoll_event. Use MAXALIGN size/alignment to guarantee that later uses of memory are aligned correctly. E.g. epoll_event might need 8 byte alignment on some platforms, but earlier allocations like WaitEventSet and WaitEvent might not sized to guarantee that when purely using sizeof(). Found by myself while testing on an Sun Ultra 5 (Sparc IIi) with some editorializing by Andres Freund. In passing fix a couple typos in the area	2016-06-02 19:38:52 +01:00
Kevin Grittner	7392eed7c2	Fix btree mark/restore bug. Commit `2ed5b87f96` introduced a bug in mark/restore, in an attempt to optimize repeated restores to the same page. This caused an assertion failure during a merge join which fed directly from an index scan, although the impact would not be limited to that case. Revert the bad chunk of code from that commit. While investigating this bug it was discovered that a particular "paranoia" set of the mark position field would not prevent bad behavior; it would just make it harder to diagnose. Change that into an assertion, which will draw attention to any future problem in that area more directly. Backpatch to 9.5, where the bug was introduced. Bug #14169 reported by Shinta Koyanagi. Preliminary analysis by Tom Lane identified which commit caused the bug.	2016-06-02 12:23:01 -05:00
Tom Lane	22b27b4c9e	Avoid useless closely-spaced writes of statistics files. The original intent in the stats collector was that we should not write out stats data oftener than every PGSTAT_STAT_INTERVAL msec. Backends will not make requests at all if they see the existing data is newer than that, and the stats collector is supposed to disregard requests having a cutoff_time older than its most recently written data, so that close-together requests don't result in multiple writes. But the latter part of that got broken in commit `187492b6c2`, so that if two backends concurrently decide the existing stats are too old, the collector would write the data twice. (In principle the collector's logic would still merge requests as long as the second one arrives before we've actually written data ... but since the message collection loop would write data immediately after processing a single inquiry message, that never happened in practice, and in any case the window in which it might work would be much shorter than PGSTAT_STAT_INTERVAL.) To fix, improve pgstat_recv_inquiry so that it checks whether the cutoff time is too old, and doesn't add a request to the queue if so. This means that we do not need DBWriteRequest.request_time, because the decision is taken before making a queue entry. And that means that we don't really need the DBWriteRequest data structure at all; an OID list of database OIDs will serve and allow removal of some rather verbose and crufty code. In passing, improve the comments in this area, which have been rather neglected. Also change backend_read_statsfile so that it's not silently relying on MyDatabaseId to have some particular value in the autovacuum launcher process. It accidentally worked as desired because MyDatabaseId is zero in that process; but that does not seem like a dependency we want, especially with no documentation about it. Although this patch is mine, it turns out I'd rediscovered a known bug, for which Tomas Vondra had already submitted a patch that's functionally equivalent to the non-cosmetic aspects of this patch. Thanks to Tomas for reviewing this version. Back-patch to 9.3 where the bug was introduced. Prior-Discussion: <1718942738eb65c8407fcd864883f4c8@fuzzy.cz> Patch: <4625.1464202586@sss.pgh.pa.us>	2016-05-31 15:55:15 -04:00
Noah Misch	2195c5afaa	Mirror struct Aggref field order in _copyAggref(). This is cosmetic, and no supported release has the affected fields.	2016-05-31 00:01:03 -04:00
Alvaro Herrera	975ad4e602	Fix PageAddItem BRIN bug BRIN was relying on the ability to remove a tuple from an index page, then putting another tuple in the same line pointer. But PageAddItem refuses to add a tuple beyond the first free item past the last used item, and in particular, it rejects an attempt to add an item to an empty page anywhere other than the first line pointer. PageAddItem issues a WARNING and indicates to the caller that it failed, which in turn causes the BRIN calling code to issue a PANIC, so the whole sequence looks like this: WARNING: specified item offset is too large PANIC: failed to add BRIN tuple To fix, create a new function PageAddItemExtended which is like PageAddItem except that the two boolean arguments become a flags bitmap; the "overwrite" and "is_heap" boolean flags in PageAddItem become PAI_OVERWITE and PAI_IS_HEAP flags in the new function, and a new flag PAI_ALLOW_FAR_OFFSET enables the behavior required by BRIN. PageAddItem() retains its original signature, for compatibility with third-party modules (other callers in core code are not modified, either). Also, in the belt-and-suspenders spirit, I added a new sanity check in brinGetTupleForHeapBlock to raise an error if an TID found in the revmap is not marked as live by the page header. This causes it to react with "ERROR: corrupted BRIN index" to the bug at hand, rather than a hard crash. Backpatch to 9.5. Bug reported by Andreas Seltenreich as detected by his handy sqlsmith fuzzer. Discussion: https://www.postgresql.org/message-id/87mvni77jh.fsf@elite.ansel.ydns.eu	2016-05-30 14:47:22 -04:00
Tom Lane	83dbde94f7	Fix DROP ACCESS METHOD IF EXISTS. The IF EXISTS option was documented, and implemented in the grammar, but it didn't actually work for lack of support in does_not_exist_skipping(). Per bug #14160. Report and patch by Kouhei Sutou Report: <20160527070433.19424.81712@wrigleys.postgresql.org>	2016-05-27 11:03:18 -04:00
Tom Lane	9dd4178cec	Be more predictable about reporting "lock timeout" vs "statement timeout". If both timeout indicators are set when we arrive at ProcessInterrupts, we've historically just reported "lock timeout". However, some buildfarm members have been observed to fail isolationtester's timeouts test by reporting "lock timeout" when the statement timeout was expected to fire first. The cause seems to be that the process is allowed to sleep longer than expected (probably due to heavy machine load) so that the lock timeout happens before we reach the point of reporting the error, and then this arbitrary tiebreak rule does the wrong thing. We can improve matters by comparing the scheduled timeout times to decide which error to report. I had originally proposed greatly reducing the 1-second window between the two timeouts in the test cases. On reflection that is a bad idea, at least for the case where the lock timeout is expected to fire first, because that would assume that it takes negligible time to get from statement start to the beginning of the lock wait. Thus, this patch doesn't completely remove the risk of test failures on slow machines. Empirically, however, the case this handles is the one we are seeing in the buildfarm. The explanation may be that the other case requires the scheduler to take the CPU away from a busy process, whereas the case fixed here only requires the scheduler to not give the CPU back right away to a process that has been woken from a multi-second sleep (and, perhaps, has been swapped out meanwhile). Back-patch to 9.3 where the isolationtester timeouts test was added. Discussion: <8693.1464314819@sss.pgh.pa.us>	2016-05-27 10:40:20 -04:00
Tom Lane	aeb9ae6457	Disable physical tlist if any Var would need multiple sortgroupref labels. As part of upper planner pathification (commit `3fc6e2d7f5`) I redid createplan.c's approach to the physical-tlist optimization, in which scan nodes are allowed to return exactly the underlying table's columns so as to save doing a projection step at runtime. The logic was intentionally more aggressive than before about applying the optimization, which is generally a good thing, but Andres Freund found a case in which it got too aggressive. Namely, if any column is referenced more than once in the parent plan node's sorting or grouping column list, we can't optimize because then that column would need to have more than one ressortgroupref label, and we only have space for one. Add logic to detect this situation in use_physical_tlist(), and also add some error checking in apply_pathtarget_labeling_to_tlist(), which this example proves was being overly cavalier about whether what it was doing made any sense. The added test case exposes the problem only because we do not eliminate duplicate grouping keys. That might be something to fix someday, but it doesn't seem like appropriate post-beta work. Report: <20160526021235.w4nq7k3gnheg7vit@alap3.anarazel.de>	2016-05-26 14:52:30 -04:00
Tom Lane	b898eb6367	Remove option to write USING before opclass name in CREATE INDEX. Dating back to commit `f10b63923`, our grammar has allowed "USING" to optionally appear before an opclass name in CREATE INDEX (and, lately, some related places such as ON CONFLICT specifications). Nikolay Shaplov noticed that this syntax existed but wasn't documented, and proposed documenting it. But what seems like a better idea is to remove the production, thereby making the code match the docs not vice versa. This isn't our usual modus operandi for such cases, but there are a couple of good reasons to proceed this way: * So far as I can find, this syntax has never been documented anywhere. It isn't relied on by any of our own code or test cases, and there seems little reason to suppose that it's been used in the wild either. * Documenting it would mean that there would be two separate uses of USING in the CREATE INDEX syntax, the other being "USING access_method". That can lead to nothing but confusion. So, let's just remove it. On the off chance that somebody somewhere is using it, this isn't something to back-patch, but we can fix it in HEAD. Discussion: <1593237.l7oKHRpxSe@nataraj-amd64>	2016-05-25 19:11:00 -04:00
Tom Lane	52e8fc3e2e	Ensure that backends see up-to-date statistics for shared catalogs. Ever since we split the statistics collector's reports into per-database files (commit `187492b6c2`), backends have been seeing stale statistics for shared catalogs. This is because the inquiry message only prompts the collector to write the per-database file for the requesting backend's own database. Stats for shared catalogs are in a separate file for "DB 0", which didn't get updated. In normal operation this was partially masked by the fact that the autovacuum launcher would send an inquiry message at least once per autovacuum_naptime that asked for "DB 0"; so the shared-catalog stats would never be more than a minute out of date. However the problem becomes very obvious with autovacuum disabled, as reported by Peter Eisentraut. To fix, redefine the semantics of inquiry messages so that both the specified DB and DB 0 will be dumped. (This might seem a bit inefficient, but we have no good way to know whether a backend's transaction will look at shared-catalog stats, so we have to read both groups of stats whenever we request stats. Sending two inquiry messages would definitely not be better.) Back-patch to 9.3 where the bug was introduced. Report: <56AD41AC.1030509@gmx.net>	2016-05-25 17:48:15 -04:00
Tom Lane	2d2e40e3be	Fetch XIDs atomically during vac_truncate_clog(). Because vac_update_datfrozenxid() updates datfrozenxid and datminmxid in-place, it's unsafe to assume that successive reads of those values will give consistent results. Fetch each one just once to ensure sane behavior in the minimum calculation. Noted while reviewing Alexander Korotkov's patch in the same area. Discussion: <8564.1464116473@sss.pgh.pa.us>	2016-05-24 15:47:51 -04:00
Tom Lane	996d273978	Avoid consuming an XID during vac_truncate_clog(). vac_truncate_clog() uses its own transaction ID as the comparison point in a sanity check that no database's datfrozenxid has already wrapped around "into the future". That was probably fine when written, but in a lazy vacuum we won't have assigned an XID, so calling GetCurrentTransactionId() causes an XID to be assigned when otherwise one would not be. Most of the time that's not a big problem ... but if we are hard up against the wraparound limit, consuming XIDs during antiwraparound vacuums is a very bad thing. Instead, use ReadNewTransactionId(), which not only avoids this problem but is in itself a better comparison point to test whether wraparound has already occurred. Report and patch by Alexander Korotkov. Back-patch to all versions. Report: <CAPpHfdspOkmiQsxh-UZw2chM6dRMwXAJGEmmbmqYR=yvM7-s6A@mail.gmail.com>	2016-05-24 15:20:36 -04:00
Alvaro Herrera	0c7cd45b6d	Fix range check for effective_io_concurrency Commit `1aba62ec` moved the range check of that option form guc.c into bufmgr.c, but introduced a bug by changing a >= 0.0 to > 0.0, which made the value 0 no longer accepted. Put it back. Reported by Jeff Janes, diagnosed by Tom Lane	2016-05-24 14:55:34 -04:00
Tom Lane	1e0d6512e5	Fix BTREE_BUILD_STATS build. Commit `65c5fcd353` broke this by removing a header include directive that is conditionally required. Add that back to nbtree.c, with annotation to keep pgrminclude from re-breaking it. Peter Geoghegan Report: <CAM3SWZTNjHFYW_UG8bu0BnogqQ2HfsTgkzXLueuUhfTcYbu5HA@mail.gmail.com>	2016-05-23 19:41:11 -04:00
Tom Lane	eae1ad9b64	Support IndexElem in raw_expression_tree_walker(). Needed for cases in which INSERT ... ON CONFLICT appears inside a recursive CTE item. Per bug #14153 from Thomas Alton. Patch by Peter Geoghegan, slightly adjusted by me Report: <20160521232802.22598.13537@wrigleys.postgresql.org>	2016-05-23 19:23:36 -04:00
Tom Lane	465e09da63	Add support for more extensive testing of raw_expression_tree_walker(). If RAW_EXPRESSION_COVERAGE_TEST is defined, do a no-op tree walk over every basic DML statement submitted to parse analysis. If we'd had this in place earlier, bug #14153 would have been caught by buildfarm testing. The difficulty is that raw_expression_tree_walker() is only used in limited cases involving CTEs (particularly recursive ones), so it's very easy for an oversight in it to not be noticed during testing of a seemingly-unrelated feature. The type of error we can expect to catch with this is complete omission of a node type from raw_expression_tree_walker(), and perhaps also recursion into a field that doesn't contain a node tree, though that would be an unlikely mistake. It won't catch failure to add new fields that need to be recursed into, unfortunately. I'll go enable this on one or two of my own buildfarm animals once bug #14153 is dealt with. Discussion: <27861.1464040417@sss.pgh.pa.us>	2016-05-23 19:08:26 -04:00
Tom Lane	8a4930e3fa	Fix latent crash in do_text_output_multiline(). do_text_output_multiline() would fail (typically with a null pointer dereference crash) if its input string did not end with a newline. Such cases do not arise in our current sources; but it certainly could happen in future, or in extension code's usage of the function, so we should fix it. To fix, replace "eol += len" with "eol = text + len". While at it, make two cosmetic improvements: mark the input string const, and rename the argument from "text" to "txt" to dodge pgindent strangeness (since "text" is a typedef name). Even though this problem is only latent at present, it seems like a good idea to back-patch the fix, since it's a very simple/safe patch and it's not out of the realm of possibility that we might in future back-patch something that expects sane behavior from do_text_output_multiline(). Per report from Hao Lee. Report: <CAGoxFiFPAGyPAJLcFxTB5cGhTW2yOVBDYeqDugYwV4dEd1L_Ag@mail.gmail.com>	2016-05-23 14:16:40 -04:00
Teodor Sigaev	7c979c95a3	Allocate all page images at once in generic wal interface That reduces number of allocation. Per gripe from Michael Paquier and Tom Lane suggestion.	2016-05-17 22:09:22 +03:00
Teodor Sigaev	7c8345f67f	Correctly align page's images in generic wal API Page image should be MAXALIGN'ed because existing code could directly align pointers in page instead of align offset from beginning of page. Found during play with indexes as extenstion, Alexander Korotkov and me	2016-05-17 00:01:35 +03:00
Peter Eisentraut	9b7bfc3a88	sql_features: Fix typos This makes the feature names match the SQL standard. From: Alexander Law <exclusion@gmail.com>	2016-05-13 21:24:54 -04:00
Alvaro Herrera	cca2a27860	Fix bogus comments Some comments mentioned XLogReplayBuffer, but there's no such function: that was an interim name for a function that got renamed to XLogReadBufferForRedo, before commit `2c03216d83` was pushed.	2016-05-12 16:07:07 -03:00
Alvaro Herrera	bdb9e3dc1d	Fix obsolete comment	2016-05-12 15:36:51 -03:00
Tom Lane	8a13d5e6d1	Fix infer_arbiter_indexes() to not barf on system columns. While it could be argued that rejecting system column mentions in the ON CONFLICT list is an unsupported feature, falling over altogether just because the table has a unique index on OID is indubitably a bug. As far as I can tell, fixing infer_arbiter_indexes() is sufficient to make ON CONFLICT (oid) actually work, though making a regression test for that case is problematic because of the impossibility of setting the OID counter to a known value. Minor cosmetic cleanups along with the bug fix.	2016-05-11 17:06:53 -04:00
Tom Lane	26e66184d6	Fix assorted missing infrastructure for ON CONFLICT. subquery_planner() failed to apply expression preprocessing to the arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt the theory was that this wasn't necessary because we don't actually try to execute those expressions; but that's wrong, because it results in failure to match to index expressions or index predicates that are changed at all by preprocessing. Per bug #14132 from Reynold Smith. Also add pullup_replace_vars processing for onConflictWhere. Perhaps it's impossible to have a subquery reference there, but I'm not exactly convinced; and even if true today it's a failure waiting to happen. Also add some comments to other places where one or another field of OnConflictExpr is intentionally ignored, with explanation as to why it's okay to do so. Also, catalog/dependency.c failed to record any dependency on the named constraint in ON CONFLICT ON CONSTRAINT, allowing such a constraint to be dropped while rules exist that depend on it, and allowing pg_dump to dump such a rule before the constraint it refers to. The normal execution path managed to error out reasonably for a dangling constraint reference, but ruleutils.c dumped core; so in addition to fixing the omission, add a protective check in ruleutils.c, since we can't retroactively add a dependency in existing databases. Back-patch to 9.5 where this code was introduced. Report: <20160510190350.2608.48667@wrigleys.postgresql.org>	2016-05-11 16:20:23 -04:00
Alvaro Herrera	15739393e4	Fix autovacuum for shared relations The table-skipping logic in autovacuum would fail to consider that multiple workers could be processing the same shared catalog in different databases. This normally wouldn't be a problem: firstly because autovacuum workers not for wraparound would simply ignore tables in which they cannot acquire lock, and secondly because most of the time these tables are small enough that even if multiple for-wraparound workers are stuck in the same catalog, they would be over pretty quickly. But in cases where the catalogs are severely bloated it could become a problem. Backpatch all the way back, because the problem has been there since the beginning. Reported by Ondřej Světlík Discussion: https://www.postgresql.org/message-id/572B63B1.3030603%40flexibee.eu https://www.postgresql.org/message-id/572A1072.5080308%40flexibee.eu	2016-05-10 16:23:54 -03:00
Peter Eisentraut	48aaba4acf	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 17bf3e8564abf600274789fcc90e72532d5e7c05	2016-05-09 10:04:41 -04:00
Tom Lane	9eb7a0ac6b	Fix poorly-worded log message. Euler Taveira	2016-05-08 01:37:07 -04:00
Tom Lane	1a2c17f8e2	Fix pg_upgrade to not fail when new-cluster TOAST rules differ from old. This patch essentially reverts commit `4c6780fd17`, in favor of a much simpler solution for the case where the new cluster would choose to create a TOAST table but the old cluster doesn't have one: just don't create a TOAST table. The existing code failed in at least two different ways if the situation arose: (1) ALTER TABLE RESET didn't grab an exclusive lock, so that the lock sanity check in create_toast_table failed; (2) pg_upgrade did not provide a pg_type OID for the new toast table, so that the crosscheck in TypeCreate failed. While both these problems were introduced by later patches, they show that the hack being used to cause TOAST table creation is overwhelmingly fragile (and untested). I also note that before the TypeCreate crosscheck was added, the code would have resulted in assigning an indeterminate pg_type OID to the toast table, possibly causing a later OID conflict in that catalog; so that it didn't really work even when committed. If we simply don't create a TOAST table, there will only be a problem if the code tries to store a tuple that's wider than a page, and field compression isn't sufficient to get it under a page. Given that the TOAST creation threshold is intended to be about a quarter of a page, it's very hard to believe that cross-version differences in the do-we-need-a-toast- table heuristic could result in an observable problem. So let's just follow the old version's conclusion about whether a TOAST table is needed. (If we ever do change needs_toast_table() so much that this conclusion doesn't apply, we can devise a solution at that time, and hopefully do it in a less klugy way than `4c6780fd17` did.) Back-patch to 9.3, like the previous patch. Discussion: <8110.1462291671@sss.pgh.pa.us>	2016-05-06 22:05:56 -04:00
Kevin Grittner	7e3da1c473	Mitigate "snapshot too old" performance regression on NUMA Limit maintenance of time to xid mapping to once per minute. At least in the tested case this brings performance within 5% of when the feature is off, compared to several times slower without this patch. While there, fix comments and whitespace. Ants Aasma, with cosmetic adjustments suggested by Andres Freund Reviewed by Kevin Grittner and Andres Freund	2016-05-06 20:05:29 -05:00
Robert Haas	68d704edbf	Minimal fix for crash bug in quals_match_foreign_key. Discussion is still underway as to whether to revert the entire patch that added this function, but that discussion may not conclude before beta1. So, in the meantime, let's do at least this much. David Rowley	2016-05-06 15:00:55 -04:00
Robert Haas	c7ea68ff8d	Limit maximum parallel degree to 1024. This new limit affects both the max_parallel_degree GUC and the parallel_degree reloption. There may some day be a use case for using more than 1024 CPUs for a single query, but that's surely not the case right now. Not only do not very many people have that many CPUs, but the code hasn't been tested at that kind of scale and is very unlikely to perform well, or even work at all, without a lot more work. The issue addressed by commit `06bd458cb8` is probably just one problem of many. The idea of a more reasonable limit here was suggested by Tom Lane; the value of 1024 was suggested by Amit Kapila.	2016-05-06 14:50:54 -04:00
Robert Haas	06bd458cb8	Use mul_size when multiplying by the number of parallel workers. That way, if the result overflows size_t, you'll get an error instead of undefined behavior, which seems like a plus. This also has the effect of casting the number of workers from int to Size, which is better because it's harder to overflow int than size_t. Dilip Kumar reported this issue and provided a patch upon which this patch is based, but his version did use mul_size.	2016-05-06 14:32:58 -04:00
Stephen Frost	a89505fd21	Remove various special checks around default roles Default roles really should be like regular roles, for the most part. This removes a number of checks that were trying to make default roles extra special by not allowing them to be used as regular roles. We still prevent users from creating roles in the "pg_" namespace or from altering roles which exist in that namespace via ALTER ROLE, as we can't preserve such changes, but otherwise the roles are very much like regular roles. Based on discussion with Robert and Tom.	2016-05-06 14:06:50 -04:00
Tom Lane	d136d600f9	Fix possible read past end of string in to_timestamp(). to_timestamp() handles the TH/th format codes by advancing over two input characters, whatever those are. It failed to notice whether there were two characters available to be skipped, making it possible to advance the pointer past the end of the input string and keep on parsing. A similar risk existed in the handling of "Y,YYY" format: it would advance over three characters after the "," whether or not three characters were available. In principle this might be exploitable to disclose contents of server memory. But the security team concluded that it would be very hard to use that way, because the parsing loop would stop upon hitting any zero byte, and TH/th format codes can't be consecutive --- they have to follow some other format code, which would have to match whatever data is there. So it seems impractical to examine memory very much beyond the end of the input string via this bug; and the input string will always be in local memory not in disk buffers, making it unlikely that anything very interesting is close to it in a predictable way. So this doesn't quite rise to the level of needing a CVE. Thanks to Wolf Roediger for reporting this bug.	2016-05-06 12:09:20 -04:00
Kevin Grittner	2cc41acd8f	Fix hash index vs "snapshot too old" problemms Hash indexes are not WAL-logged, and so do not maintain the LSN of index pages. Since the "snapshot too old" feature counts on detecting error conditions using the LSN of a table and all indexes on it, this makes it impossible to safely do early vacuuming on any table with a hash index, so add this to the tests for whether the xid used to vacuum a table can be adjusted based on old_snapshot_threshold. While at it, add a paragraph to the docs for old_snapshot_threshold which specifically mentions this and other aspects of the feature which may otherwise surprise users. Problem reported and patch reviewed by Amit Kapila	2016-05-06 07:47:12 -05:00
Tom Lane	0b9a234432	Rename tsvector delete() to ts_delete(), and filter() to ts_filter(). The similarity of the original names to SQL keywords seems like a bad idea. Rename them before we're stuck with 'em forever. In passing, minor code and docs cleanup. Discussion: <4875.1462210058@sss.pgh.pa.us>	2016-05-05 19:43:32 -04:00
Dean Rasheed	18a02ad2a5	Fix corner-case loss of precision in numeric pow() calculation Commit `7d9a4737c2` greatly improved the accuracy of the numeric transcendental functions, however it failed to consider the case where the result from pow() is close to the overflow threshold, for example 0.12 ^ -2345.6. For such inputs, where the result has more than 2000 digits before the decimal point, the decimal result weight estimate was being clamped to 2000, leading to a loss of precision in the final calculation. Fix this by replacing the clamping code with an overflow test that aborts the calculation early if the final result is sure to overflow, based on the overflow limit in exp_var(). This provides the same protection against integer overflow in the subsequent result scale computation as the original clamping code, but it also ensures that precision is never lost and saves compute cycles in cases that are sure to overflow. The new early overflow test works with the initial low-precision result (expected to be accurate to around 8 significant digits) and includes a small fuzz factor to ensure that it doesn't kick in for values that would not overflow exp_var(), so the overall overflow threshold of pow() is unchanged and consistent for all inputs with non-integer exponents. Author: Dean Rasheed Reviewed-by: Tom Lane Discussion: http://www.postgresql.org/message-id/CAEZATCUj3U-cQj0jjoia=qgs0SjE3auroxh8swvNKvZWUqegrg@mail.gmail.com See-also: http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dJ8Q0zihXQT1semcQuTeK+4_rogC_zq5Hw@mail.gmail.com	2016-05-05 11:16:17 +01:00
Alvaro Herrera	c1543a81a7	Revert timeline following in replication slots This reverts commits `f07d18b6e9`, `82c83b3372`, `3a3b309041`, and `24c5f1a103`. This feature has shown enough immaturity that it was deemed better to rip it out before rushing some more fixes at the last minute. There are discussions on larger changes in this area for the next release.	2016-05-04 17:32:22 -03:00
Teodor Sigaev	4bbc1a7ea3	Fix crash of filter(tsvector) Variable storing a position of lexeme, had a wrong type: char, it's obviously not enough to store 2^14 possible positions. Stas Kelvich	2016-05-04 17:58:08 +03:00
Andres Freund	a712487087	Fix transient mdsync() errors of truncated relations due to `72a98a6395`. Unfortunately the segment size checks from `72a98a6395` had the negative side-effect of breaking a corner case in mdsync(): When processing a fsync request for a truncated away segment mdsync() could fail with "could not fsync file" (if previous segment < RELSEG_SIZE) because _mdfd_getseg() now wouldn't return the relevant segment anymore. The cleanest fix seems to be to allow the caller of _mdfd_getseg() to specify whether checks for RELSEG_SIZE are performed. To allow doing so, change the ExtensionBehavior enum into a bitmask. Besides allowing for the addition of EXTENSION_DONT_CHECK_SIZE, this makes for a nicer implementation of EXTENSION_REALLY_RETURN_NULL. Besides mdsync() the only callsite that should change behaviour due to this is mdprefetch() which now doesn't create segments anymore, even in recovery. Given the uses of mdprefetch() that seems better. Reported-By: Thom Brown Discussion: CAA-aLv72QazLvPdKZYpVn4a_Eh+i4_cxuB03k+iCuZM_xjc+6Q@mail.gmail.com	2016-05-04 01:54:20 -07:00
Robert Haas	9888b34fdb	Fix more things to be parallel-safe. Conversion functions were previously marked as parallel-unsafe, since that is the default, but in fact they are safe. Parallel-safe functions defined in pg_proc.h and redefined in system_views.sql were ending up as parallel-unsafe because the redeclarations were not marked PARALLEL SAFE. While editing system_views.sql, mark ts_debug() parallel safe also. Andreas Karlsson	2016-05-03 14:36:38 -04:00
Robert Haas	8826d85078	Tweak a few more things in preparation for upcoming pgindent run. These adjustments adjust code and comments in minor ways to prevent pgindent from mangling them. Among other things, I tried to avoid situations where pgindent would emit "a +b" instead of "a + b", and I tried to avoid having it break up inline comments across multiple lines.	2016-05-03 10:52:25 -04:00
Robert Haas	1e77949e67	Note that max_worker_processes requires restart. Since this is a minor issue, no back-patch. Julien Rouhaud	2016-05-03 10:39:21 -04:00
Alvaro Herrera	234a266066	Fix code comments regarding logical decoding Back in `3b02ea4f07` I added some comments in various places to explain how logical decoding and other things worked. Not all of the changes were welcome, because they were misleading or wrong. This changes them a little bit to make them more accurate. Some other comments are also changed to be more accurate. Also, fix a bunch of typos. Author: Álvaro Herrera, Craig Ringer Andres Freund reviewed some parts of this.	2016-05-02 16:04:29 -03:00
Robert Haas	37d0c2cb1a	Fix parallel safety markings for pg_start_backup. Commit `7117685461` made pg_start_backup parallel-restricted rather than parallel-safe, because it now relies on backend-private state that won't be synchronized with the parallel worker. However, it didn't update pg_proc.h. Separately, Andreas Karlsson observed that system_views.sql neglected to reiterate the parallel-safety markings whe redefining various functions, including this one; so add a PARALLEL RESTRICTED declaration there to match the new value in pg_proc.h.	2016-05-02 10:42:34 -04:00
Tom Lane	2a2435e699	Small improvements to OPTIMIZER_DEBUG code. Now that Paths have their own rows field, print that rather than the parent relation's rowcount. Show the relid sets associated with Paths using table names rather than numbers; since this code is able to print simple Var references using table names, it seems a bit silly that print_relids can't. Print the cheapest_parameterized_paths list for a RelOptInfo, and include information about a parameterized path's required_outer rels. Noted while trying to use this feature to debug Alexander Kirkouski's recent bug report.	2016-04-30 14:08:00 -04:00
Tom Lane	c45bf5751b	Fix planner crash from pfree'ing a partial path that a GatherPath uses. We mustn't run generate_gather_paths() during add_paths_to_joinrel(), because that function can be invoked multiple times for the same target joinrel. Not only is it wasteful to build GatherPaths repeatedly, but a later add_partial_path() could delete the partial path that a previously created GatherPath depends on. Instead establish the convention that we do generate_gather_paths() for a rel only just before set_cheapest(). The code was accidentally not broken for baserels, because as of today there never is more than one partial path for a baserel. But that assumption obviously has a pretty short half-life, so move the generate_gather_paths() calls for those cases as well. Also add some generic comments explaining how and why this all works. Per fuzz testing by Andreas Seltenreich. Report: <871t5pgwdt.fsf@credativ.de>	2016-04-30 12:29:21 -04:00
Tom Lane	17d5db352c	Remove warning about num_sync being too large in synchronous_standby_names. If we're not going to reject such setups entirely, throwing a WARNING in check_synchronous_standby_names() is unhelpful, because it will cause the warning to be logged again every time the postmaster receives SIGHUP. Per discussion, just remove the warning. In passing, improve the documentation for synchronous_commit, which had not gotten the word that now there can be more than one synchronous standby.	2016-04-30 10:54:45 -04:00
Tom Lane	207d5a656e	Fix mishandling of equivalence-class tests in parameterized plans. Given a three-or-more-way equivalence class, such as X.Y = Y.Y = Z.Z, it was possible for the planner to omit one of the quals needed to enforce that all members of the equivalence class are actually equal. This only happened in the case of a parameterized join node for two of the relations, that is a plan tree like Nested Loop -> Scan X -> Nested Loop -> Scan Y -> Scan Z Filter: Z.Z = X.X The eclass machinery normally expects to apply X.X = Y.Y when those two relations are joined, but in this shape of plan tree they aren't joined until the top node --- and, if the lower nested loop is marked as parameterized by X, the top node will assume that the relevant eclass condition(s) got pushed down into the lower node. On the other hand, the scan of Z assumes that it's only responsible for constraining Z.Z to match any one of the other eclass members. So one or another of the required quals sometimes fell between the cracks, depending on whether consideration of the eclass in get_joinrel_parampathinfo() for the lower nested loop chanced to generate X.X = Y.Y or X.X = Z.Z as the appropriate constraint there. If it generated the latter, it'd erroneously suppose that the Z scan would take care of matters. To fix, force X.X = Y.Y to be generated and applied at that join node when this case occurs. This is extremely hard to hit in practice, because various planner behaviors conspire to mask the problem; starting with the fact that the planner doesn't really like to generate a parameterized plan of the above shape. (It might have been impossible to hit it before we tweaked things to allow this plan shape for star-schema cases.) Many thanks to Alexander Kirkouski for submitting a reproducible test case. The bug can be demonstrated in all branches back to 9.2 where parameterized paths were introduced, so back-patch that far.	2016-04-29 20:19:38 -04:00
Kevin Grittner	7c3e8039f4	Add a few entries to the tail of time mapping, to see old values. Without a few entries beyond old_snapshot_threshold, the lookup would often fail, resulting in the more aggressive pruning or vacuum being skipped often enough to matter. This was very clearly shown by a python test script posted by Ants Aasma, and was likely a factor in an earlier but somewhat less clear-cut test case posted by Jeff Janes. This patch makes no change to the logic, per se -- it just makes the array of mapping entries big enough to make lookup misses based on timing much less likely. An occasional miss is still possible if a thread stalls for more than 10 minutes, but that does not create any problem with correctness of behavior. Besides, if things are so busy that a thread is stalling for more than 10 minutes, it is probably OK to skip the more aggressive cleanup at that particular point in time.	2016-04-29 16:46:08 -05:00
Andrew Dunstan	0fb54de9aa	Support building with Visual Studio 2015 Adjust the way we detect the locale. As a result the minumum Windows version supported by VS2015 and later is Windows Vista. Add some tweaks to remove new compiler warnings. Remove documentation references to the now obsolete msysGit. Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich. Backpatch to 9.5	2016-04-29 08:09:07 -04:00
Andres Freund	59455018a8	Remember asking for feedback during walsender shutdown. Since `5a991ef8` we're explicitly asking for feedback from the receiving side when shutting down walsender, if there's not yet replicated data. Unfortunately we didn't remember (i.e. set waiting_for_ping_response to true) having asked for feedback, leading to scenarios in which replies were requested at a high frequency. I can't reproduce this problem on my laptop, I think that's because the problem requires a significant TCP window to manifest due to the !pq_is_send_pending() condition. But since this clearly is a bug, let's fix it. There's quite possibly more wrong than just this though. While fiddling with WalSndDone(), I rewrote a hard to understand comment about looking at the flush vs. the write position. Reported-By: Nick Cleaton, Magnus Hagander Author: Nick Cleaton Discussion: CAFgz3kus=rC_avEgBV=+hRK5HYJ8vXskJRh8yEAbahJGTzF2VQ@mail.gmail.com CABUevExsjROqDcD0A2rnJ6HK6FuKGyewJr3PL12pw85BHFGS2Q@mail.gmail.com Backpatch: 9.4, were `5a991ef8` introduced the use of feedback messages during shutdown.	2016-04-28 22:11:18 -07:00
Teodor Sigaev	f8467f7da8	Prevent to use magic constants Use macroses for definition amstrategies/amsupport fields instead of hardcoded values. Author: Nikolay Shaplov with addition for contrib/bloom	2016-04-28 16:39:25 +03:00
Teodor Sigaev	e2c79e14d9	Prevent multiple cleanup process for pending list in GIN. Previously, ginInsertCleanup could exit early if it detects that someone else is cleaning up the pending list, without waiting for that someone else to finish the job. But in this case vacuum could miss tuples to be deleted. Cleanup process now locks metapage with a help of heavyweight LockPage(ExclusiveLock), and it guarantees that there is no another cleanup process at the same time. Lock is taken differently depending on caller of cleanup process: any vacuums and gin_clean_pending_list() will be blocked until lock becomes available, ordinary insert uses conditional lock to prevent indefinite waiting on lock. Insert into pending list doesn't use this lock, so insertion isn't blocked. Also, patch adds stopping of cleanup process when at-start-cleanup-tail is reached in order to prevent infinite cleanup in case of massive insertion. But it will stop only for automatic maintenance tasks like autovacuum. Patch introduces choice of limit of memory to use: autovacuum_work_mem, maintenance_work_mem or work_mem depending on call path. Patch for previous releases should be reworked due to changes between 9.6 and previous ones in this area. Discover and diagnostics by Jeff Janes and Tomas Vondra Patch by me with some ideas of Jeff Janes	2016-04-28 16:21:42 +03:00
Tom Lane	4c804fbdfb	Clean up parsing of synchronous_standby_names GUC variable. Commit `989be0810d` added a flex/bison lexer/parser to interpret synchronous_standby_names. It was done in a pretty crufty way, though, making assorted end-use sites responsible for calling the parser at the right times. That was not only vulnerable to errors of omission, but made it possible that lexer/parser errors occur at very undesirable times, and created memory leakages even if there was no error. Instead, perform the parsing once during check_synchronous_standby_names and let guc.c manage the resulting data. To do that, we have to flatten the parsed representation into a single hunk of malloc'd memory, but that is not very hard. While at it, work a little harder on making useful error reports for parsing problems; the previous code felt that "synchronous_standby_names parser returned 1" was an appropriate user-facing error message. (To be fair, it did also log a syntax error message, but separately from the GUC problem report, which is at best confusing.) It had some outright bugs in the face of invalid input, too. I (tgl) also concluded that we need to restrict unquoted names in synchronous_standby_names to be just SQL identifiers. The previous coding would accept darn near anything, which (1) makes the quoting convention both nearly-unnecessary and formally ambiguous, (2) makes it very hard to understand what is a syntax error and what is a creative interpretation of the input as a standby name, and (3) makes it impossible to further extend the syntax in future without a compatibility break. I presume that we're intending future extensions of the syntax, else this parsing infrastructure is massive overkill, so (3) is an important objection. Since we've taken a compatibility hit for non-identifier names with this change anyway, we might as well lock things down now and insist that users use double quotes for standby names that aren't identifiers. Kyotaro Horiguchi and Tom Lane	2016-04-27 17:55:25 -04:00
Robert Haas	372ff7cae2	Fix wrong word. Commit `a31212b429` was a little too hasty. Per report from Tom Lane.	2016-04-27 14:23:56 -04:00
Robert Haas	a31212b429	Change postgresql.conf.sample to say that fsync=off will corrupt data. Discussion: 24748.1461764666@sss.pgh.pa.us Per a suggestion from Craig Ringer. This wording from Tom Lane, following discussion.	2016-04-27 13:47:07 -04:00
Robert Haas	cf402ba734	Tighten up sanity checks for parallel aggregate in execQual.c. David Rowley	2016-04-27 12:05:35 -04:00
Robert Haas	b33dc77665	Remove inadvertently commited vim swapfile. If you were wondering what editor I use, now you know.	2016-04-27 11:53:01 -04:00
Robert Haas	8126eaee2f	Clean up a few parallelism-related things that pgindent wants to mangle. In nodeFuncs.c, pgindent wants to introduce spurious indentation into the definitions of planstate_tree_walker and planstate_walk_subplans. Fix that by spreading the definition out across several lines, similar to what is already done for other walker functions in that file. In execParallel.c, in the definition of SharedExecutorInstrumentation, pgindent wants to insert more whitespace between the type name and the member name. That causes it to mangle comments later on the line. Fix by moving the comments out of line. Now that we have a bit more room, add some more details that may be useful to the next person reading this code.	2016-04-27 11:29:45 -04:00
Robert Haas	360ca27a9b	Remove mergeHyperLogLog. It's buggy. If somebody needs this later, they'll need to put back a non-buggy vesion of it. Discussion: CAM3SWZT-i6R9JU5YXa8MJUou2_r3LfGJZpQ9tYa1BYxfkj0=cQ@mail.gmail.com Discussion: CAM3SWZRUOLsYoTT83QgdUy9D8ehYWm_nvbrrfcOOzikiRfFY7g@mail.gmail.com Peter Geoghegan	2016-04-27 10:55:32 -04:00
Robert Haas	59eb551279	Fix EXPLAIN VERBOSE output for parallel aggregate. The way that PartialAggregate and FinalizeAggregate plan nodes were displaying output columns before was bogus. Now, FinalizeAggregate produces the same outputs as an Aggregate would have produced, while PartialAggregate produces each of those outputs prefixed by the word PARTIAL. Discussion: 12585.1460737650@sss.pgh.pa.us Patch by me, reviewed by David Rowley.	2016-04-27 07:37:40 -04:00
Andres Freund	72a98a6395	Don't open formally non-existent segments in _mdfd_getseg(). Before this commit _mdfd_getseg(), in contrast to mdnblocks(), did not verify whether all segments leading up to the to-be-opened one, were RELSEG_SIZE sized. That is e.g. not the case after truncating a relation, because later segments just get truncated to zero length, not removed. Once a "non-existent" segment has been opened in a session, mdnblocks() will return wrong results, causing errors like "could not read block %u in file" when accessing blocks. Closing the session, or the later arrival of relevant invalidation messages, would "fix" the problem. That, so far, was mostly harmless, because most segment accesses are only done after an mdnblocks() call. But since `428b1d6b29` we try to open segments that might have been deleted, to trigger kernel writeback from a backend's queue of recent writes. To fix check segment sizes in _mdfd_getseg() when opening previously unopened segments. In practice this shouldn't imply a lot of additional lseek() calls, because mdnblocks() will most of the time already have opened all relevant segments. This commit also fixes a second problem, namely that _mdfd_getseg( EXTENSION_RETURN_NULL) extends files during recovery, which is not desirable for the mdwriteback() case. Add EXTENSION_REALLY_RETURN_NULL, which does not behave that way, and use it. Reported-By: Thom Brown Author: Andres Freund, Abhijit Menon-Sen Reviewd-By: Robert Haas, Fabien Coehlo Discussion: CAA-aLv6Dp_ZsV-44QA-2zgkqWKQq=GedBX2dRSrWpxqovXK=Pg@mail.gmail.com Fixes: `428b1d6b29`	2016-04-26 20:32:51 -07:00
Andres Freund	c6ff84b06a	Emit invalidations to standby for transactions without xid. So far, when a transaction with pending invalidations, but without an assigned xid, committed, we simply ignored those invalidation messages. That's problematic, because those are actually sent for a reason. Known symptoms of this include that existing sessions on a hot-standby replica sometimes fail to notice new concurrently built indexes and visibility map updates. The solution is to WAL log such invalidations in transactions without an xid. We considered to alternatively force-assign an xid, but that'd be problematic for vacuum, which might be run in systems with few xids. Important: This adds a new WAL record, but as the patch has to be back-patched, we can't bump the WAL page magic. This means that standbys have to be updated before primaries; otherwise "PANIC: standby_redo: unknown op code 32" errors can be encountered. XXX: Reported-By: Васильев Дмитрий, Masahiko Sawada Discussion: CAB-SwXY6oH=9twBkXJtgR4UC1NqT-vpYAtxCseME62ADwyK5OA@mail.gmail.com CAD21AoDpZ6Xjg=gFrGPnSn4oTRRcwK1EBrWCq9OqOHuAcMMC=w@mail.gmail.com	2016-04-26 20:21:54 -07:00
Robert Haas	2ac3be2e76	Fix pg_get_functiondef to dump parallel-safety markings. Ashutosh Sharma	2016-04-26 22:56:27 -04:00
Tom Lane	82311bcdd7	Yet more portability hacking for degree-based trig functions. The true explanation for Peter Eisentraut's report of inexact asind results seems to be that (a) he's compiling into x87 instruction set, which uses wider-than-double float registers, plus (b) the library function asin() on his platform returns a result that is wider than double and is not rounded to double width. To fix, we have to force the function's result to be rounded comparably to what happened to the scaling constant asin_0_5. Experimentation suggests that storing it into a volatile local variable is the least ugly way of making that happen. Although only asin() is known to exhibit an observable inexact result, we'd better do this in all the places where we're hoping to get an exact result by scaling.	2016-04-26 11:24:15 -04:00
Robert Haas	77cd477c4b	Enable parallel query by default. Change max_parallel_degree default from 0 to 2. It is possible that this is not a good idea, or that we should go with 1 worker rather than 2, but we won't find out without trying it. Along the way, reword the documentation for max_parallel_degree a little bit to hopefully make it more clear. Discussion: 20160420174631.3qjjhpwsvvx5bau5@alap3.anarazel.de	2016-04-26 08:35:58 -04:00
Magnus Hagander	b7351ced42	Fix typo in comment Author: Daniel Gustafsson	2016-04-26 10:38:32 +02:00
Kevin Grittner	e65953be4f	Fix C comment typo and redundant test	2016-04-25 15:42:29 -05:00
Tom Lane	6b1a213bbd	New method for preventing compile-time calculation of degree constants. Commit `65abaab547` tried to prevent the scaling constants used in the degree-based trig functions from being precomputed at compile time, because some compilers do that with functions that don't yield results identical-to-the-last-bit to what you get at runtime. A report from Peter Eisentraut suggests that some recent compilers are smart enough to see through that trick, though. Instead, let's put the inputs to these calculations into non-const global variables, which should be a more reliable way of convincing the compiler that it can't assume that they are compile-time constants. (If we really get desperate, we could mark these variables "volatile", but I do not believe we should have to.)	2016-04-25 15:21:04 -04:00
Andres Freund	8f91d87d43	Fix documentation & config inconsistencies around `428b1d6b2`. Several issues: 1) checkpoint_flush_after doc and code disagreed about the default 2) new GUCs were missing from postgresql.conf.sample 3) Outdated source-code comment about bgwriter_flush_after's default 4) Sub-optimal categories assigned to new GUCs 5) Docs suggested backend_flush_after is PGC_SIGHUP, but it's PGC_USERSET. 6) Spell out int as integer in the docs, as done elsewhere Reported-By: Magnus Hagander, Fujii Masao Discussion: CAHGQGwETyTG5VYQQ5C_srwxWX7RXvFcD3dKROhvAWWhoSBdmZw@mail.gmail.com	2016-04-24 12:26:55 -07:00
Tom Lane	0ab3595e5b	Rename strtoi() to strtoint(). NetBSD has seen fit to invent a libc function named strtoi(), which conflicts with the long-established static functions of the same name in datetime.c and ecpg's interval.c. While muttering darkly about intrusions on application namespace, we'll rename our functions to avoid the conflict. Back-patch to all supported branches, since this would affect attempts to build any of them on recent NetBSD. Thomas Munro	2016-04-23 16:53:15 -04:00
Bruce Momjian	915cee4595	Properly mark initRectBox() as taking 'void' args Was part of box type in SP-GiST index patch. Reported-by: Emre Hasegeli	2016-04-23 10:41:11 -04:00
Tom Lane	abb164655c	Fix unexpected side-effects of operator_precedence_warning. The implementation of that feature involves injecting nodes into the raw parsetree where explicit parentheses appear. Various places in parse_expr.c that test to see "is this child node of type Foo" need to look through such nodes, else we'll get different behavior when operator_precedence_warning is on than when it is off. Note that we only need to handle this when testing untransformed child nodes, since the AEXPR_PAREN nodes will be gone anyway after transformExprRecurse. Per report from Scott Ribe and additional code-reading. Back-patch to 9.5 where this feature was added. Report: <ED37E303-1B0A-4CD8-8E1E-B9C4C2DD9A17@elevated-dev.com>	2016-04-21 23:17:36 -04:00
Tom Lane	80f66a9ad0	Fix planner failure with full join in RHS of left join. Given a left join containing a full join in its righthand side, with the left join's joinclause referencing only one side of the full join (in a non-strict fashion, so that the full join doesn't get simplified), the planner could fail with "failed to build any N-way joins" or related errors. This happened because the full join was seen as overlapping the left join's RHS, and then recent changes within join_is_legal() caused that function to conclude that the full join couldn't validly be formed. Rather than try to rejigger join_is_legal() yet more to allow this, I think it's better to fix initsplan.c so that the required join order is explicit in the SpecialJoinInfo data structure. The previous coding there essentially ignored full joins, relying on the fact that we don't flatten them in the joinlist data structure to preserve their ordering. That's sufficient to prevent a wrong plan from being formed, but as this example shows, it's not sufficient to ensure that the right plan will be formed. We need to work a bit harder to ensure that the right plan looks sane according to the SpecialJoinInfos. Per bug #14105 from Vojtech Rylko. This was apparently induced by commit `8703059c6` (though now that I've seen it, I wonder whether there are related cases that could have failed before that); so back-patch to all active branches. Unfortunately, that patch also went into 9.0, so this bug is a regression that won't be fixed in that branch.	2016-04-21 20:05:58 -04:00
Tom Lane	125ad539a2	Improve TranslateSocketError() to handle more Windows error codes. The coverage was rather lean for cases that bind() or listen() might return. Add entries for everything that there's a direct equivalent for in the set of Unix errnos that elog.c has heard of.	2016-04-21 16:58:47 -04:00
Tom Lane	1f7c85b820	Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK. When we shoehorned "x op ANY (array)" into the SQL syntax, we created a fundamental ambiguity as to the proper treatment of a sub-SELECT on the righthand side: perhaps what's meant is to compare x against each row of the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar sub-SELECT that delivers a single array value whose members should be compared against x. The grammar resolves it as the former case whenever the RHS is a select_with_parens, making the latter case hard to reach --- but you can get at it, with tricks such as attaching a no-op cast to the sub-SELECT. Parse analysis would throw away the no-op cast, leaving a parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr. ruleutils.c was not clued in on this fine point, and would naively emit "x op ANY ((SELECT ...))", which would be parsed as the first alternative, typically leading to errors like "operator does not exist: text = text[]" during dump/reload of a view or rule containing such a construct. To fix, emit a no-op cast when dumping such a parsetree. This might well be exactly what the user wrote to get the construct accepted in the first place; and even if she got there with some other dodge, it is a valid representation of the parsetree. Per report from Karl Czajkowski. He mentioned only a case involving RLS policies, but actually the problem is very old, so back-patch to all supported branches. Report: <20160421001832.GB7976@moraine.isi.edu>	2016-04-21 14:20:30 -04:00
Robert Haas	c4a586c486	Prevent possible crash reading pg_stat_activity. Also, avoid reading PGPROC's wait_event field twice, once for the wait event and again for the wait_event_type, because the value might change in the middle. Petr Jelinek and Robert Haas	2016-04-21 14:02:15 -04:00
Robert Haas	36f69faeff	Comment improvements for ForeignPath. It's not necessarily just scanning a base relation any more. Amit Langote and Etsuro Fujita	2016-04-21 13:30:48 -04:00
Robert Haas	9f84280ae9	Fix assorted defects in `09adc9a8c0`. That commit increased all shared memory allocations to the next higher multiple of PG_CACHE_LINE_SIZE, but it didn't ensure that allocation started on a cache line boundary. It also failed to remove a couple other pieces of now-useless code. BUFFERALIGN() is perhaps obsolete at this point, and likely should be removed at some point, too, but that seems like it can be left to a future cleanup. Mistakes all pointed out by Andres Freund. The patch is mine, with a few extra assertions which I adopted from his version of this fix.	2016-04-21 13:27:41 -04:00
Kevin Grittner	11e178d0dc	Inline initial comparisons in TestForOldSnapshot() Even with old_snapshot_threshold = -1 (which disables the "snapshot too old" feature), performance regressions were seen at moderate to high concurrency. For example, a one-socket, four-core system running 200 connections at saturation could see up to a 2.3% regression, with larger regressions possible on NUMA machines. By inlining the early (smaller, faster) tests in the TestForOldSnapshot() function, the i7 case dropped to a 0.2% regression, which could easily just be noise, and is clearly an improvement. Further testing will show whether more is needed.	2016-04-21 08:40:08 -05:00

... 2 3 4 5 6 ...

16246 Commits