postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-06 11:46:54 +02:00

Author	SHA1	Message	Date
Alvaro Herrera	e3ad3ffa68	Fix handling of multixacts predating pg_upgrade After pg_upgrade, it is possible that some tuples' Xmax have multixacts corresponding to the old installation; such multixacts cannot have running members anymore. In many code sites we already know not to read them and clobber them silently, but at least when VACUUM tries to freeze a multixact or determine whether one needs freezing, there's an attempt to resolve it to its member transactions by calling GetMultiXactIdMembers, and if the multixact value is "in the future" with regards to the current valid multixact range, an error like this is raised: ERROR: MultiXactId 123 has not been created yet -- apparent wraparound and vacuuming fails. Per discussion with Andrew Gierth, it is completely bogus to try to resolve multixacts coming from before a pg_upgrade, regardless of where they stand with regards to the current valid multixact range. It's possible to get from under this problem by doing SELECT FOR UPDATE of the problem tuples, but if tables are large, this is slow and tedious, so a more thorough solution is desirable. To fix, we realize that multixacts in xmax created in 9.2 and previous have a specific bit pattern that is never used in 9.3 and later (we already knew this, per comments and infomask tests sprinkled in various places, but we weren't leveraging this knowledge appropriately). Whenever the infomask of the tuple matches that bit pattern, we just ignore the multixact completely as if Xmax wasn't set; or, in the case of tuple freezing, we act as if an unwanted value is set and clobber it without decoding. This guarantees that no errors will be raised, and that the values will be progressively removed until all tables are clean. Most callers of GetMultiXactIdMembers are patched to recognize directly that the value is a removable "empty" multixact and avoid calling GetMultiXactIdMembers altogether. 
To avoid changing the signature of GetMultiXactIdMembers() in back branches, we keep the "allow_old" boolean flag but rename it to "from_pgupgrade"; if the flag is true, we always return an empty set instead of looking up the multixact. (I suppose we could remove the argument in the master branch, but I chose not to do so in this commit). This was broken all along, but the error-facing message appeared first because of commit `8e9a16ab8f` and was partially fixed in `a25c2b7c4d`. This fix, backpatched all the way back to 9.3, goes approximately in the same direction as `a25c2b7c4d` but should cover all cases. Bug analysis by Andrew Gierth and Álvaro Herrera. A number of public reports match this bug: https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net https://www.postgresql.org/message-id/538F3D70.6080902@publicrelay.com https://www.postgresql.org/message-id/556439CF.7070109@pscs.co.uk https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com https://www.postgresql.org/message-id/20160615203829.5798.4594@wrigleys.postgresql.org	2016-06-24 18:29:28 -04:00
Robert Haas	9e9c38e159	postgres_fdw: Fix incorrect NULL handling in join pushdown. something.* IS NOT NULL means that every attribute of the row is not NULL, not that the row itself is non-NULL (e.g. because it's coming from below an outer join). Use (somevar.*)::pg_catalog.text IS NOT NULL instead. Ashutosh Bapat, per a report by Rushabh Lathia. Reviewed by Amit Langote and Etsuro Fujita. Schema-qualification added by me.	2016-06-24 15:14:15 -04:00
Robert Haas	267569b24c	postgres_fdw: Remove useless return statement. Etsuro Fujita	2016-06-24 14:33:13 -04:00
Tom Lane	e611515dd6	pg_trgm's set_limit() function is parallel unsafe, not parallel restricted. Per buildfarm. Fortunately, it's not quite too late to squeeze this fix into the pg_trgm 1.3 update.	2016-06-20 11:29:54 -04:00
Tom Lane	9c852566a3	Fix comparison of similarity to threshold in GIST trigram searches. There was some very strange code here, dating to commit `b525bf77`, that purported to work around an ancient gcc bug by forcing a float4 comparison to be done as int instead. Commit `5871b8848` broke that when it changed one side of the comparison to "double" but left the comparison code alone. Commit `f576b17cd` doubled down on the weirdness by introducing a "volatile" marker, which had nothing to do with the actual problem. Guess that the gcc bug, even if it's still present in the wild, was triggered by comparison of float4's and can be avoided if we store the result of cnt_sml() into a double before comparing to the double "nlimit". This will at least work correctly on non-broken compilers, and it's way more readable. Per bug #14202 from Greg Navis. Add a regression test based on his example. Report: <20160620115321.5792.10766@wrigleys.postgresql.org>	2016-06-20 10:49:19 -04:00
Tom Lane	7e81a18d49	Fix parallel-safety markings for contrib/dblink. As shown by buildfarm reports, dblink_build_sql_insert and dblink_build_sql_update are not parallel safe, because they may attempt to access temporary tables of the local session. Although dblink_build_sql_delete doesn't actually touch the contents of the referenced table, it seems consistent and prudent to mark it PARALLEL RESTRICTED too.	2016-06-17 23:08:21 -04:00
Robert Haas	71d05a2c7b	pg_visibility: Add pg_truncate_visibility_map function. This requires some core changes as well so that we can properly WAL-log the truncation. Specifically, it changes the format of the XLOG_SMGR_TRUNCATE WAL record, so bump XLOG_PAGE_MAGIC. Patch by me, reviewed but not fully endorsed by Andres Freund.	2016-06-17 17:37:30 -04:00
Robert Haas	20eb2731b7	Update dblink extension for parallel query. Almost all functions provided by this extension are PARALLEL RESTRICTED. Mostly, that's because the leader's TCP connections won't be shared with the workers, but in some cases like dblink_get_pkey it's because they obtain locks which might be released early if taken within a parallel worker. dblink_fdw_validator probably can't be used in a query anyway, but there would be no problem from the point of view of parallel query if it were, so it's PARALLEL SAFE. Andreas Karlsson	2016-06-17 15:18:44 -04:00
Robert Haas	177c56d608	postgres_fdw: Rephrase comment. Per gripe from Thomas Munro, who only complained about a more localized problem, but I couldn't resist a bit more wordsmithing.	2016-06-17 13:02:22 -04:00
Robert Haas	e472ce9624	Add integrity-checking functions to pg_visibility. The new pg_check_visible() and pg_check_frozen() functions can be used to verify that the visibility map bits for a relation's data pages match the actual state of the tuples on those pages. Amit Kapila and Robert Haas, reviewed (in earlier versions) by Andres Freund. Additional testing help by Thomas Munro.	2016-06-15 14:33:58 -04:00
Robert Haas	13e7453135	Update xml2 extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-14 15:49:32 -04:00
Robert Haas	20f6c3a2a1	Update uuid-ossp extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-14 14:56:21 -04:00
Robert Haas	202ac08c08	Update unaccent extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-14 14:55:49 -04:00
Robert Haas	6b7d11ffda	Update sslinfo extension for parallel query. All functions provided by this extension are PARALLEL RESTRICTED, because they provide information about the connection state. Parallel workers don't have this information and therefore these functions can't be executed in a worker (but they can be present in a query some other part of which uses parallelism). Andreas Karlsson	2016-06-14 14:52:55 -04:00
Robert Haas	2910fc8239	Update extensions with GIN/GIST support for parallel query. Commit `749a787c5b` bumped the extension version on all of these extensions already, and we haven't had a release since then, so we can make further changes without bumping the extension version again. Take this opportunity to mark all of the functions exported by these modules PARALLEL SAFE -- except for pg_trgm's set_limit(). Mark that one PARALLEL RESTRICTED, because it makes a persistent change to a GUC value. Note that some of the markings added by this commit don't have any effect; for example, gseg_picksplit() isn't likely to be mentioned explicitly in a query and therefore its parallel-safety marking will never be consulted. But this commit just marks everything for consistency: if it were somehow used in a query, that would be fine as far as parallel query is concerned, since it does not consult any backend-private state, attempt to write data, etc. Andreas Karlsson, with a few revisions by me.	2016-06-14 13:34:37 -04:00
Robert Haas	131c7e70b4	postgres_fdw: Check PlaceHolderVars before pushing down a join. As discovered by Andreas Seltenreich via sqlsmith, it's possible for a remote join to need to generate a target list which contains a PlaceHolderVar which would need to be evaluated on the remote server. This happens when we try to push down a join tree which contains outer joins and the nullable side of the join contains a subquery which evaluates some expression which can go to NULL above the level of the join. Since the deparsing logic can't build a remote query that involves subqueries, it fails while trying to produce an SQL query that can be sent to the remote side. Detect such cases and don't try to push down the join at all. It's actually fine to push down the join if the PlaceHolderVar needs to be evaluated at the current join level. This patch makes a small change to build_tlist_to_deparse so that this case will work. Amit Langote, Ashutosh Bapat, and me.	2016-06-14 11:48:27 -04:00
Tom Lane	5484c0a980	Minor fixes in contrib installation scripts. Extension scripts should never use CREATE OR REPLACE for initial object creation. If there is a collision with a pre-existing (probably user-created) object, we want extension installation to fail, not silently overwrite the user's object. Bloom and sslinfo both violated this precept. Also fix a number of scripts that had no standard header (the file name comment and the \echo...\quit guard). Probably the \echo...\quit hack is less important now than it was in 9.1 days, but that doesn't mean that individual extensions get to choose whether to use it or not. And fix a couple of evident copy-and-pasteos in file name comments. No need for back-patch: the REPLACE bugs are both new in 9.6, and the rest of this is pretty much cosmetic. Andreas Karlsson and Tom Lane	2016-06-14 10:47:06 -04:00
Robert Haas	332fdbef20	postgres_fdw: Promote an Assert() to elog(). Andreas Seltenreich reports that it is possible for a PlaceHolderVar to creep into this tlist, and I fear that even after that's fixed we might have other, similar bugs in this area either now or in the future. There's a lot of action-at-a-distance here, because the validity of this assertion depends on core planner behavior; so, let's use elog() to make sure we catch this even in non-assert builds, rather than just crashing.	2016-06-14 09:00:12 -04:00
Noah Misch	3be0a62ffe	Finish pgindent run for 9.6: Perl files.	2016-06-12 04:19:56 -04:00
Robert Haas	a8501ba119	Update pgstattuple extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-10 10:42:03 -04:00
Robert Haas	496899ccc2	Update pg_stat_statements extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Given the general prohibition against write operations in parallel queries, it is perhaps a bit surprising that pg_stat_statements_reset() is parallel safe. But since it only modifies shared memory, not the database, it's OK. Andreas Karlsson	2016-06-10 10:42:01 -04:00
Robert Haas	3d8fc8c68c	Schema-qualify some references to regprocedure. Andreas Karlsson, per a gripe from Tom Lane.	2016-06-10 10:41:58 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Robert Haas	9164deea2f	Update pgrowlocks extension for parallel query. The pgrowlocks function provided by this extension is PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:35:53 -04:00
Robert Haas	6b3586caa8	Update pg_prewarm extension for parallel query. The pg_prewarm function provided by this extension is PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:18:18 -04:00
Robert Haas	42d4257a06	Update pg_freespacemap extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:18:16 -04:00
Robert Haas	0dbf3ce0e0	Update pgcrypto extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:18:14 -04:00
Robert Haas	06d7fd6e29	Update pg_buffercache extension for parallel query. The pg_buffercache_pages function provided by this extension is PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:18:12 -04:00
Robert Haas	e3b607cd7a	Update pageinspect extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:18:09 -04:00
Tom Lane	749a787c5b	Handle contrib's GIN/GIST support function signature changes honestly. In commits `9ff60273e3` and `dbe2328959` I (tgl) fixed the signatures of a bunch of contrib's GIN and GIST support functions so that they would pass validation by the recently-added amvalidate functions. The backend does not actually consult or check those signatures otherwise, so I figured this was basically cosmetic and did not require an extension version bump. However, Alexander Korotkov pointed out that that would leave us in a pretty messy situation if we ever wanted to redefine those functions later, because there wouldn't be a unique way to name them. Since we're going to be bumping these extensions' versions anyway for parallel-query cleanups, let's take care of this now. Andreas Karlsson, adjusted for more search-path-safety by me	2016-06-09 16:44:25 -04:00
Alvaro Herrera	4f04b66f97	Fix loose ends for SQL ACCESS METHOD objects. COMMENT ON ACCESS METHOD was missing; add it, along with psql tab-completion support for it. psql was also missing a way to list existing access methods; the new \dA command does that. Also add tab-completion support for DROP ACCESS METHOD. Author: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAB7nPqTzdZdu8J7EF8SXr_R2U5bSUUYNOT3oAWBZdEoggnwhGA@mail.gmail.com	2016-06-07 17:59:34 -04:00
Robert Haas	e7880e5d39	Update lo extension for parallel query. The lo_oid function provided by this extension is PARALLEL SAFE. Andreas Karlsson	2016-06-07 11:26:42 -04:00
Robert Haas	b79b8d8f55	Update isn extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-07 11:26:42 -04:00
Robert Haas	1ab194a3a9	Update intagg extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-07 11:26:41 -04:00
Robert Haas	ffab82fbda	Update fuzzystrmatch extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-07 11:26:41 -04:00
Robert Haas	50e5226bb3	Update earthdistance extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-07 11:26:41 -04:00
Robert Haas	a89b4b1be0	Update citext extension for parallel query. All citext functions are PARALLEL SAFE, and a couple of them can benefit from having aggregate combine functions. Andreas Karlsson	2016-06-07 11:26:41 -04:00
Tom Lane	8a859691d5	Properly initialize SortSupport for ORDER BY rechecks in nodeIndexscan.c. Fix still another bug in commit `35fcb1b3d`: it failed to fully initialize the SortSupport states it introduced to allow the executor to re-check ORDER BY expressions containing distance operators. That led to a null pointer dereference if the sortsupport code tried to use ssup_cxt. The problem only manifests in narrow cases, explaining the lack of previous field reports. It requires a GiST-indexable distance operator that lacks SortSupport and is on a pass-by-ref data type, which among core+contrib seems to be only btree_gist's interval opclass; and it requires the scan to be done as an IndexScan not an IndexOnlyScan, which explains how btree_gist's regression test didn't catch it. Per bug #14134 from Jihyun Yu. Peter Geoghegan Report: <20160511154904.2603.43889@wrigleys.postgresql.org>	2016-06-05 11:53:06 -04:00
Tom Lane	de33af8823	Update contrib/tsearch2/expected/tsearch2_1.out for phrase FTS. Commits `bb140506d` and `38627f687` didn't bother with this. Per buildfarm member magpie.	2016-06-04 00:49:42 -04:00
Tom Lane	ee4af347ba	Measure Bloom index signature-length reloption in bits, not words. Per discussion, this is a more understandable and future-proof way of exposing the setting to users. On-disk, we can still store it in words, so as to not break on-disk compatibility with beta1. Along the way, clean up the code associated with Bloom reloptions. Provide explicit macros for default and maximum lengths rather than having magic numbers buried in multiple places in the code. Drop the adjustBloomOptions() code altogether: it was useless in view of the fact that reloptions.c already performed default-substitution and range checking for the options. Rename a couple of macros and types for more clarity. Discussion: <23767.1464926580@sss.pgh.pa.us>	2016-06-03 10:52:45 -04:00
Tom Lane	abaffa9075	Fix contrib/bloom to work for unlogged indexes. blbuildempty did not do even approximately the right thing: it tried to add a metapage to the relation's regular data fork, which already has one at that point. It should look like the ambuildempty methods for all the standard index types, ie, initialize a metapage image in some transient storage and then write it directly to the init fork. To support that, refactor BloomInitMetapage into two functions. In passing, fix BloomInitMetapage so it doesn't leave the rd_options field of the index's relcache entry pointing at transient storage. I'm not sure this had any visible consequence, since nothing much else is likely to look at a bloom index's rd_options, but it's certainly poor practice. Per bug #14155 from Zhou Digoal. Report: <20160524144146.22598.42558@wrigleys.postgresql.org>	2016-05-24 21:04:35 -04:00
Tom Lane	e13ac5586c	Avoid possible crash in contrib/bloom's blendscan(). It's possible to begin and end an indexscan without ever calling amrescan. contrib/bloom, unlike every other index AM, allocated its "scan->opaque" storage at amrescan time, and thus would crash in amendscan if amrescan hadn't been called. We could fix this by putting in a null-pointer check in blendscan, but I see no very good reason why contrib/bloom should march to its own drummer in this respect. Let's move that initialization to blbeginscan instead. Per report from Jeff Janes.	2016-05-17 17:01:18 -04:00
Robert Haas	02a568a027	postgres_fdw: Fix the fix for crash when pushing down multiple joins. Commit `3151f16e18` was intended to be a commit of a patch from Ashutosh Bapat, but instead I mistakenly committed an earlier version from Michael Paquier (because both patches were submitted with the same filename, and I confused them). Michael's patch fixes the crash but doesn't actually implement the correct test. Repair the incorrect logic, and also expand the comments considerably so that this is all more clear. Ashutosh Bapat and Robert Haas	2016-05-16 11:28:28 -04:00
Robert Haas	1b812afb0e	Fix multiple problems in postgres_fdw query cancellation logic. First, even if we cancel a query, we still have to roll back the containing transaction; otherwise, the session will be left in a failed transaction state. Second, we need to support canceling queries when aborting a subtransaction as well as when aborting a toplevel transaction. Etsuro Fujita, reviewed by Michael Paquier	2016-05-16 11:19:10 -04:00
Tom Lane	d94977ef1c	Ensure plan stability in contrib/btree_gist regression test. Buildfarm member skink failed with symptoms suggesting that an auto-analyze had happened and changed the plan displayed for a test query. Although this is evidently of low probability, regression tests that sometimes fail are no fun, so add commands to force a bitmap scan to be chosen.	2016-05-12 20:04:38 -04:00
Robert Haas	8826d85078	Tweak a few more things in preparation for upcoming pgindent run. These adjustments adjust code and comments in minor ways to prevent pgindent from mangling them. Among other things, I tried to avoid situations where pgindent would emit "a +b" instead of "a + b", and I tried to avoid having it break up inline comments across multiple lines.	2016-05-03 10:52:25 -04:00
Heikki Linnakangas	d22b85fbd4	Remove unused macros. CHECK_PAGE_OFFSET_RANGE() has been unused forever. CHECK_RELATION_BLOCK_RANGE() has been unused in pgstatindex.c ever since bt_page_stats() and bt_page_items() functions were moved from pgstattuple to pageinspect module. It still exists in pageinspect/btreefuncs.c. Daniel Gustafsson	2016-05-02 10:07:49 +03:00
Tom Lane	f050423052	Revert "Convert contrib/seg's bool-returning SQL functions to V1 call convention." This reverts commit `c8e81afc60`. That turns out to have been based on a faulty diagnosis of why the VS2015 build was misbehaving. Instead, we need to fix DatumGetBool().	2016-04-28 11:46:07 -04:00
Teodor Sigaev	f8467f7da8	Avoid magic constants. Use macros to define the amstrategies/amsupport fields instead of hardcoded values. Author: Nikolay Shaplov with addition for contrib/bloom	2016-04-28 16:39:25 +03:00
Tom Lane	c8e81afc60	Convert contrib/seg's bool-returning SQL functions to V1 call convention. It appears that we can no longer get away with using V0 call convention for bool-returning functions in newer versions of MSVC. The compiler seems to generate code that doesn't clear the higher-order bits of the result register, causing the bool result Datum to often read as "true" when "false" was intended. This is not very surprising, since the function thinks it's returning a bool-width result but fmgr_oldstyle assumes that V0 functions return "char *"; what's surprising is that that hack worked for so long on so many platforms. The only functions of this description in core+contrib are in contrib/seg, which we'd intentionally left mostly in V0 style to serve as a warning canary if V0 call convention breaks. We could imagine hacking things so that they're still V0 (we'd have to redeclare the bool-returning functions as returning some suitably wide integer type, like size_t, at the C level). But on the whole it seems better to convert 'em to V1. We can still leave the pointer- and int-returning functions in V0 style, so that the test coverage isn't gone entirely. Back-patch to 9.5, since our intention is to support VS2015 in 9.5 and later. There's no SQL-level change in the functions' behavior so back-patching should be safe enough. Discussion: <22094.1461273324@sss.pgh.pa.us> Michael Paquier, adjusted some by me	2016-04-22 11:54:23 -04:00
Tom Lane	14216649f3	PGDLLIMPORT-ify old_snapshot_threshold. Revert commit `7cb1db1d95`, which represented a misunderstanding of the problem (if snapmgr.h weren't already included in bufmgr.h, things wouldn't compile anywhere). Instead install what I think is the real fix.	2016-04-21 14:33:34 -04:00
Kevin Grittner	7cb1db1d95	Include snapmgr.h in blscan.c Windows builds on buildfarm are failing because old_snapshot_threshold is not found in the bloom filter contrib module.	2016-04-21 11:51:20 -05:00
Robert Haas	f039eaac71	Allow queries submitted by postgres_fdw to be canceled. This fixes a problem which is not new, but with the advent of direct foreign table modification in `0bf3ae88af`, it's somewhat more likely to be annoying than previously. So, arrange for a local query cancelation to propagate to the remote side. Michael Paquier, reviewed by Etsuro Fujita. Original report by Thom Brown.	2016-04-21 10:49:09 -04:00
Robert Haas	5b1f9ce1d9	postgres_fdw: Don't push down certain full joins. If there's a filter condition on either side of a full outer join, it is neither correct to attach it to the join's ON clause nor to throw it into the toplevel WHERE clause. Just don't push down the join in that case. To maximize the number of cases where we can still push down full joins, push inner join conditions into the ON clause at the first opportunity rather than postponing them to the top-level WHERE clause. This produces nicer SQL, anyway. This bug was introduced in `e4106b2528`. Ashutosh Bapat, per report from Rajkumar Raghuwanshi.	2016-04-20 23:54:19 -04:00
Kevin Grittner	a343e223a5	Revert no-op changes to BufferGetPage() The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas	2016-04-20 08:31:19 -05:00
Robert Haas	da7d44b627	postgres_fdw: Clean up handling of system columns. Previously, querying the xmin column of a single postgres_fdw foreign table fetched the tuple length, xmax the typmod, and cmin or cmax the composite type OID of the tuple. However, when you queried several such tables and the join got shipped to the remote side, these columns ended up containing the remote values of the corresponding columns. Both behaviors are rather unprincipled, the former for obvious reasons and the latter because the remote values of these columns don't have any local significance; our transaction IDs are in a different space than those of the remote machine. Clean this up by setting all of these fields to 0 in both cases. Also fix the handling of tableoid to be sane. Robert Haas and Ashutosh Bapat, reviewed by Etsuro Fujita.	2016-04-15 12:08:14 -04:00
Tom Lane	6a3d3965d6	Fix core dump in ReorderBufferRestoreChange on alignment-picky platforms. When re-reading an update involving both an old tuple and a new tuple from disk, reorderbuffer.c was careless about whether the new tuple is suitably aligned for direct access --- in general, it isn't. We'd missed seeing this in the buildfarm because the contrib/test_decoding tests exercise this code path only a few times, and by chance all of those cases have old tuples with length a multiple of 4, which is usually enough to make the access to the new tuple's t_len safe. For some still-not-entirely-clear reason, however, Debian's sparc build gets a bus error, as reported by Christoph Berg; perhaps it's assuming 8-byte alignment of the pointer? The lack of previous field reports is probably because you need all of these conditions to trigger a crash: an alignment-picky platform (not Intel), a transaction large enough to spill to disk, an update within that xact that changes a primary-key field and has an odd-length old tuple, and of course logical decoding tracing the transaction. Avoid the alignment assumption by using memcpy instead of fetching t_len directly, and add a test case that exposes the crash on picky platforms. Back-patch to 9.4 where the bug was introduced. Discussion: <20160413094117.GC21485@msg.credativ.de>	2016-04-14 19:42:21 -04:00
Andres Freund	be65eddd80	Add required database and origin filtering for logical messages. Logical messages, added in `3fe3511d05`, during decoding failed to filter messages emitted in other databases and messages emitted "under" a replication origin the output plugin isn't interested in. Add tests to verify that both types of filtering actually work. While touching message.sql remove hunk obsoleted by `d25379e`. Bump XLOG_PAGE_MAGIC because xl_logical_message changed and because `3fe3511d05` had omitted doing so. `3fe3511d05` additionally didn't bump catversion, but `7a542700d` has done so since. Author: Petr Jelinek Reported-By: Andres Freund Discussion: 20160406142513.wotqy3ba3kanr423@alap3.anarazel.de	2016-04-13 17:38:54 -07:00
Tom Lane	5713f03973	Improve API of GenericXLogRegister(). Rename this function to GenericXLogRegisterBuffer() to make it clearer what it does, and leave room for other sorts of "register" actions in future. Also, replace its "bool isNew" argument with an integer flags argument, so as to allow adding more flags in future without an API break. Alexander Korotkov, adjusted slightly by me	2016-04-12 11:42:06 -04:00
Teodor Sigaev	813b456ea2	Add page id to bloom index. Added to ensure that bloom index pages can be distinguished from other pages by pg_filedump. Because there weren't any public/production versions before, this change doesn't pay attention to any compatibility issues. Per notice from Tom Lane	2016-04-12 18:03:01 +03:00
Andres Freund	48354581a4	Allow Pin/UnpinBuffer to operate in a lockfree manner. Pinning/Unpinning a buffer is a very frequent operation; especially in read-mostly cache resident workloads. Benchmarking shows that in various scenarios the spinlock protecting a buffer header's state becomes a significant bottleneck. The problem can be reproduced with pgbench -S on larger machines, but can be considerably worse for queries which touch the same buffers over and over at a high frequency (e.g. nested loops over a small inner table). To allow atomic operations to be used, cram BufferDesc's flags, usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable; that allows them to be manipulated together using 32bit compare-and-swap operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could be lifted by using a 64bit field, but it's not a realistic configuration atm). As not all operations can easily be implemented in a lockfree manner, implement the previous buf_hdr_lock via a flag bit in the atomic variable. That way we can continue to lock the header in places where it's needed, but can get away without acquiring it in the more frequent hot-paths. There are some additional operations which can be done without the lock, but aren't in this patch; but the most important places are covered. As bufmgr.c now essentially re-implements spinlocks, abstract the delay logic from s_lock.c into something more generic. It already has two users, and more are coming up; there's a follow-up patch for lwlock.c at least. This patch is based on a proof-of-concept written by me, which Alexander Korotkov made into a fully working patch; the committed version is again revised by me. Benchmarking and testing has, amongst others, been provided by Dilip Kumar, Alexander Korotkov, Robert Haas. On a large x86 system improvements for readonly pgbench, with a high client count, of a factor of 8 have been observed.
Author: Alexander Korotkov and Andres Freund Discussion: 2400449.GjM57CE0Yg@dinodell	2016-04-10 20:12:32 -07:00
Tom Lane	cf223c3bf5	Improve contrib/bloom regression test using code coverage info. Originally, this test created a 100000-row test table, which made it run rather slowly compared to other contrib tests. Investigation with gcov showed that we got no further improvement in code coverage after the first 700 or so rows, making the large table 99% a waste of time. Cut it back to 2000 rows to fix the runtime problem and still leave some headroom for testing behaviors that may appear later. A closer look at the gcov results showed that the main coverage omissions in contrib/bloom occurred because the test never filled more than one entry in the notFullPage array; which is unsurprising because it exercised index cleanup only in the scenario of complete table deletion, allowing every page in the index to become deleted rather than not-full. Add testing that allows the not-full path to be exercised as well. Also, test the amvalidate function, because blvalidate.c had zero coverage without that, and besides it's a good idea to check for mistakes in the bloom opclass definitions.	2016-04-10 13:12:24 -04:00
Tom Lane	80cf18910c	Get rid of blinsert()'s use of GenericXLogUnregister(). That routine is dangerous, and unnecessary once we get rid of this one caller. In passing, fix failure to clean up temp memory context, or switch back to caller's context, during slowest exit path.	2016-04-09 15:39:14 -04:00
Kevin Grittner	848ef42bb8	Add the "snapshot too old" feature This feature is controlled by a new old_snapshot_threshold GUC. A value of -1 disables the feature, and that is the default. The value of 0 is just intended for testing. Above that it is the number of minutes a snapshot can reach before pruning and vacuum are allowed to remove dead tuples which the snapshot would otherwise protect. The xmin associated with a transaction ID does still protect dead tuples. A connection which is using an "old" snapshot does not get an error unless it accesses a page modified recently enough that it might not be able to produce accurate results. This is similar to the Oracle feature, and we use the same SQLSTATE and error message for compatibility.	2016-04-08 14:36:30 -05:00
Kevin Grittner	8b65cf4c5e	Modify BufferGetPage() to prepare for "snapshot too old" feature This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.	2016-04-08 14:30:10 -05:00
Teodor Sigaev	8b99edefca	Revert CREATE INDEX ... INCLUDING ... It's not ready yet, revert two commits `690c543550` - unstable test output `386e3d7609` - patch itself	2016-04-08 21:52:13 +03:00
Teodor Sigaev	38627f6878	Fix output of regression test of contrib/tsearch2 Just forgot to add it in `1ec4c7c055`	2016-04-08 20:37:12 +03:00
Teodor Sigaev	386e3d7609	CREATE INDEX ... INCLUDING (column[, ...]) Now indexes (but only B-tree for now) can contain "extra" column(s) which don't participate in the index structure; they are just stored in leaf tuples. This allows an index-only scan to use a single index instead of two or more indexes. Author: Anastasia Lubennikova with minor editorializing by me Reviewers: David Rowley, Peter Geoghegan, Jeff Janes	2016-04-08 19:45:59 +03:00
Peter Eisentraut	339025c68f	Replace printf format %i by %d see also `ce8d7bb644`	2016-04-08 12:42:58 -04:00
Peter Eisentraut	8b737f9084	Fix printf format	2016-04-08 12:34:33 -04:00
Teodor Sigaev	bb140506df	Phrase full text search. Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery. On-disk and binary in/out format of tsquery are backward compatible. It has two side effects: - the sort order for tsquery changes, so users who have a btree index over tsquery should reindex it - fewer parentheses appear in tsquery output, so tsquery becomes more readable Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov Reviewers: Alexander Korotkov, Artur Zakirov	2016-04-07 18:44:18 +03:00
Tom Lane	de94e2af18	Run pgindent on a batch of (mostly-planner-related) source files. Getting annoyed at the amount of unrelated chatter I get from pgindent'ing Rowley's unique-joins patch. Re-indent all the files it touches.	2016-04-06 11:34:02 -04:00
Simon Riggs	d25379eb23	Modify test_decoding/messages to remove non-ascii chars	2016-04-06 14:55:11 +01:00
Simon Riggs	3fe3511d05	Generic Messages for Logical Decoding API and mechanism to allow generic messages to be inserted into WAL that are intended to be read by logical decoding plugins. This commit adds an optional new callback to the logical decoding API. Messages are either text or bytea. Messages can be transactional, or not, and are identified by a prefix to allow multiple concurrent decoding plugins. (Not to be confused with Generic WAL records, which are intended to allow crash recovery of extensible objects.) Author: Petr Jelinek and Andres Freund Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs Discussion: 5685F999.6010202@2ndquadrant.com	2016-04-06 10:05:41 +01:00
Teodor Sigaev	eb7308d298	Fix typo Michael Paquier	2016-04-04 14:55:29 +03:00
Tom Lane	a75a418d07	Clean up dubious code in contrib/seg. The restore() function assumed that the result of sprintf() with %e format would necessarily contain an 'e', which is false: what if the supplied number is an infinity or NaN? If that did happen, we'd get a null-pointer-dereference core dump. The case appears impossible currently, because seg_in() does not accept such values, and there are no seg-creating functions that would create one. But it seems unwise to rely on it never happening in future. Quite aside from that, the code was pretty ugly: it relied on modifying a static format string when it could use a "*" precision argument, and it used strtok() entirely gratuitously, and it stripped off trailing spaces by hand instead of just not asking for them to begin with. Coverity noticed the potential null pointer dereference (though I wonder why it didn't complain years ago, since this code is ancient). Since this is just code cleanup and forestalling a hypothetical future bug, there seems no need for back-patching.	2016-04-03 17:36:53 -04:00
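The pitfall described in the commit above can be reproduced outside C. The sketch below uses Python's printf-style formatting, which accepts the same `%e` conversion and `*` precision argument; it illustrates the failure mode only and is not the contrib/seg code. The helper names are hypothetical.

```python
def format_sci(value, digits):
    # "*" pulls the precision from the argument list, so no format string
    # has to be modified at run time.
    return "%.*e" % (digits, value)


def split_exponent(text):
    """Split '1.23e+03' into ('1.23', 3); tolerate output with no 'e'."""
    mantissa, sep, exponent = text.partition("e")
    if not sep:
        # Infinity and NaN format as 'inf' / 'nan', which contain no 'e'.
        return text, None
    return mantissa, int(exponent)
```

Code that assumes the separator is always present has nothing to index into for infinities and NaN, which is the null-pointer case the commit guards against.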
Tom Lane	8f75fd1f40	Fix contrib/bloom to not fail under CLOBBER_CACHE_ALWAYS. The code was supposing that rd_amcache wouldn't disappear from under it during a scan; which is wrong. Copy the data out of the relcache rather than trying to reference it there.	2016-04-03 15:16:07 -04:00
Tom Lane	a9284849b4	Clean up some stuff in new contrib/bloom module. Coverity complained about implicit sign-extension in the BloomPageGetFreeSpace macro, probably because sizeOfBloomTuple isn't wide enough for size calculations. No overflow is really possible as long as maxoff and sizeOfBloomTuple are small enough to represent a realistic situation, but it seems like a good idea to declare sizeOfBloomTuple as Size not int32. Add missing check on BloomPageAddItem() result, again from Coverity. Avoid core dump due to not allocating so->sign array when scan->numberOfKeys is zero. Also thanks to Coverity. Use FLEXIBLE_ARRAY_MEMBER rather than declaring an array as size 1 when it isn't necessarily. Very minor beautification of related code. Unfortunately, none of the Coverity-detected mistakes look like they could account for the remaining buildfarm unhappiness with this module. It's barely possible that the FLEXIBLE_ARRAY_MEMBER mistake does account for that, if it's enabling bogus compiler optimizations; but I'm not terribly optimistic. We probably still have bugs to find here.	2016-04-03 14:17:23 -04:00
Tom Lane	5a5b917184	Add missing "static". Per buildfarm member pademelon.	2016-04-02 13:59:11 -04:00
Teodor Sigaev	9c50372d20	Fix condition in `e9e441c9fa` The comment is right, but the if condition is not.	2016-04-02 18:38:16 +03:00
Teodor Sigaev	e9e441c9fa	Prevent marking a page as both deleted and 'has free space' in bloom module Vacuum might put a page into the list of pages with some free space and mark it as deleted at the same time.	2016-04-02 14:20:46 +03:00
Teodor Sigaev	80afb62db0	Fixes in bloom contrib module Looking at the results from buildfarm member jaguarundi, it seems to me that BloomOptions sometimes isn't initialized, but I don't yet see how that's possible. Nevertheless, a check of the signature length is missing, so add a limit on it. Also add a missing GenericXLogAbort() for the case of an already-deleted page in vacuum, plus minor code refactoring.	2016-04-02 13:47:04 +03:00
Noah Misch	4ad6f13500	Copyedit comments and documentation.	2016-04-01 21:53:10 -04:00
Teodor Sigaev	27f3bbfad4	Fixes in bloom contrib module missed during review - macros like (var & FLAG) are changed to ((var & FLAG) != 0) - do not copy uninitialized part of notFullPage array to page	2016-04-01 20:09:13 +03:00
Teodor Sigaev	9ee014fc89	Bloom index contrib module Module provides new access method. It is actually a simple Bloom filter implemented as pgsql's index. It could give some benefits on searches over a large number of columns. The module is currently the only way to test the generic WAL interface committed earlier. Author: Teodor Sigaev, Alexander Korotkov Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby	2016-04-01 16:42:24 +03:00
Robert Haas	5d4171d1c7	Don't require a user mapping for FDWs to work. Commit `fbe5a3fb73` accidentally changed this behavior; put things back the way they were, and add some regression tests. Report by Andres Freund; patch by Ashutosh Bapat, with a bit of kibitzing by me.	2016-03-28 21:50:28 -04:00
Alvaro Herrera	3e1338475f	Add missing checks to some of pageinspect's BRIN functions brin_page_type() and brin_metapage_info() did not enforce being called by superuser, like other pageinspect functions that take bytea do. Since they don't verify the passed page thoroughly, it is possible to use them to read the server memory with a carefully crafted bytea value, up to a few kilobytes from where the input bytea is located. Have them throw errors if called by a non-superuser. Report and initial patch: Andreas Seltenreich Security: CVE-2016-3065	2016-03-28 10:57:42 -03:00
Andres Freund	1a7a43672b	Don't use !! but != 0/NULL to force boolean evaluation. I introduced several uses of !! to force bit arithmetic to be boolean, but per discussion the project prefers != 0/NULL. Discussion: CA+TgmoZP5KakLGP6B4vUjgMBUW0woq_dJYi0paOz-My0Hwt_vQ@mail.gmail.com	2016-03-27 18:10:19 +02:00
Robert Haas	3151f16e18	postgres_fdw: Fix crash when pushing down multiple joins. A join clause might mention multiple relations on either side, so it need not be the case that a given joinrel's constituent relations are all on one side of the join clause or all on the other. Report by Rajkumar Raghuwanshi. Analysis and fix by Michael Paquier and Ashutosh Bapat.	2016-03-23 12:28:01 -04:00
Tom Lane	92b7902deb	Clean up some Coverity complaints about commit `0bf3ae88af`. The two get_tle_by_resno() calls introduced by this commit lacked any check for a NULL return, unlike any other calls of that function anywhere in our tree. Coverity quite properly complained about it. Also fix a misindented line in process_query_params(), which Coverity also complained about on the grounds that the bad indentation suggested possible programmer misinterpretation.	2016-03-21 12:00:02 -04:00
Tom Lane	d5351fcb03	Fix phony .PHONY. A couple makefiles had misspelled the magic .PHONY target as PHONY.	2016-03-19 17:19:37 -04:00
Robert Haas	0bf3ae88af	Directly modify foreign tables. postgres_fdw can now send an UPDATE or DELETE statement directly to the foreign server in simple cases, rather than sending a SELECT FOR UPDATE statement and then updating or deleting rows one-by-one. Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro Horiguchi, Albe Laurenz, Thom Brown, and me.	2016-03-18 13:55:52 -04:00
Robert Haas	2d8a1e22b1	Various minor corrections of and improvements to comments. Aleksander Alekseev	2016-03-18 09:38:59 -04:00
Teodor Sigaev	aa698d7535	pg_trgm's set_limit() now uses SetConfigOption() Deprecated set_limit() is modified to use SetConfigOption() to set similarity_threshold, which is actually an instance of the pg_trgm.similarity_threshold GUC variable. The previous coding set similarity_threshold directly, which could cause an inconsistency between the state of the actual variable and its GUC representation. Per gripe from Tom Lane	2016-03-18 12:26:27 +03:00
Teodor Sigaev	e4b523e5b5	Add files forgotten in `f576b17cd6`	2016-03-16 19:23:41 +03:00
Teodor Sigaev	f576b17cd6	Add word_similarity to pg_trgm contrib module. Patch introduces a concept of similarity over string and just a word from another string. Version of extension is not changed because 1.2 was already introduced in 9.6 release cycle, so, there wasn't a public version. Author: Alexander Korotkov, Artur Zakirov	2016-03-16 18:59:21 +03:00
Teodor Sigaev	5871b88487	GUC variable pg_trgm.similarity_threshold instead of set_limit() Use GUC variable pg_trgm.similarity_threshold instead of set_limit()/show_limit(), which were introduced when modules could not yet define GUC variables. Author: Artur Zakirov	2016-03-16 17:44:58 +03:00
Teodor Sigaev	ce91b9209f	fix typo in comment	2016-03-16 17:18:14 +03:00
Teodor Sigaev	9a206d063c	Improve script generating unaccent rules Script now uses the standard Unicode transliterator Latin-ASCII. Author: Leonard Benedetti	2016-03-16 16:47:03 +03:00
Robert Haas	3aff33aa68	Fix typos. Oskari Saarenmaa	2016-03-15 18:06:11 -04:00
Robert Haas	4a46a99d89	postgres_fdw: make_tuple_from_result_row should set cur_attno for ctid. There's no reason for this function to do this for every other attribute number and omit it for CTID, especially since conversion_error_callback has code to handle that case. This seems to be an oversight in commit `e690b95150`. Etsuro Fujita	2016-03-15 16:51:56 -04:00
Tom Lane	28048cbaa2	Allow callers of create_foreignscan_path to specify nondefault PathTarget. Although the default choice of rel->reltarget should typically be sufficient for scan or join paths, it's not at all sufficient for the purposes PathTargets were invented for; in particular not for upper-relation Paths. So break API compatibility by adding a PathTarget argument to create_foreignscan_path(). To ease updating of existing code, accept a NULL value of the argument as selecting rel->reltarget.	2016-03-14 17:31:28 -04:00
Tom Lane	307c78852f	Rethink representation of PathTargets. In commit `19a541143a` I did not make PathTarget a subtype of Node, and embedded a RelOptInfo's reltarget directly into it rather than having a separately-allocated Node. In hindsight that was misguided micro-optimization, enabled by the fact that at that point we didn't have any Paths with custom PathTargets. Now that PathTarget processing has been fleshed out some more, it's easier to see that it's better to have PathTarget as an independent Node type, even if it does cost us one more palloc to create a RelOptInfo. So change it while we still can. This commit just changes the representation, without doing anything more interesting than that.	2016-03-14 16:59:59 -04:00
Robert Haas	6be84eeb8d	Update more comments for `96198d94cb`. Etsuro Fujita, reviewed (though not completely endorsed) by Ashutosh Bapat, and slightly expanded by me.	2016-03-14 14:29:12 -04:00
Magnus Hagander	7a8d874836	Rename auto_explain.sample_ratio to sample_rate Per suggestion from Tomas Vondra Author: Julien Rouhaud	2016-03-13 13:18:03 +01:00
Tom Lane	23a27b039d	Widen query numbers-of-tuples-processed counters to uint64. This patch widens SPI_processed, EState's es_processed field, PortalData's portalPos field, FuncCallContext's call_cntr and max_calls fields, ExecutorRun's count argument, PortalRunFetch's result, and the max number of rows in a SPITupleTable to uint64, and deals with (I hope) all the ensuing fallout. Some of these values were declared uint32 before, and others "long". I also removed PortalData's posOverflow field, since that logic seems pretty useless given that portalPos is now always 64 bits. The user-visible results are that command tags for SELECT etc will correctly report tuple counts larger than 4G, as will plpgsql's GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples than that are still not exactly the norm, but they're becoming more common. Most values associated with FETCH/MOVE distances, such as PortalRun's count argument and the count argument of most SPI functions that have one, remain declared as "long". It's not clear whether it would be worth promoting those to int64; but it would definitely be a large dollop of additional API churn on top of this, and it would only help 32-bit platforms which seem relatively less likely to see any benefit. Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me	2016-03-12 16:05:29 -05:00
Magnus Hagander	92f03fe76f	Allow setting sample ratio for auto_explain New configuration parameter auto_explain.sample_ratio makes it possible to log just a fraction of the queries meeting the configured threshold, to reduce the amount of logging. Author: Craig Ringer and Julien Rouhaud Review: Petr Jelinek	2016-03-11 15:08:34 +01:00
Tom Lane	364a9f47ab	Refactor pull_var_clause's API to make it less tedious to extend. In commit `1d97c19a0f` and later `c1d9579dd8`, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway.	2016-03-10 15:53:07 -05:00
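The API shape argued for in the commit above, one bitmask argument with a usable default instead of several enum arguments, can be sketched as follows. This is a toy Python model, not the planner's code: the flag names, the tuple-based expression tree, and `pull_vars` are hypothetical.

```python
from enum import IntFlag


class PullFlags(IntFlag):
    # Hypothetical flags; a new behavior adds a new bit, and existing
    # callers that pass the default are left untouched.
    NONE = 0
    INCLUDE_AGGREGATES = 1
    RECURSE_AGGREGATES = 2


def pull_vars(node, flags=PullFlags.NONE):
    """Collect variables from a toy tree of ('var', name), ('agg', child)
    and ('op', child, ...) tuples."""
    kind = node[0]
    if kind == "var":
        return [node]
    if kind == "agg":
        if flags & PullFlags.INCLUDE_AGGREGATES:
            return [node]  # report the aggregate itself
        if flags & PullFlags.RECURSE_AGGREGATES:
            return pull_vars(node[1], flags)  # look inside it
        raise ValueError("aggregate found where none was expected")
    found = []
    for child in node[1:]:
        found.extend(pull_vars(child, flags))
    return found
```

With positional enum arguments, adding a third option would have meant editing every call; here only the callers that want the new behavior pass the new bit.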
Andres Freund	1d4a0ab19a	Avoid unlikely data-loss scenarios due to rename() without fsync. Renaming a file using rename(2) is not guaranteed to be durable in face of crashes. Use the previously added durable_rename()/durable_link_or_rename() in various places where we previously just renamed files. Most of the changed call sites are arguably not critical, but it seems better to err on the side of too much durability. The most prominent known case where the previously missing fsyncs could cause data loss is crashes at the end of a checkpoint. After the actual checkpoint has been performed, old WAL files are recycled. When they're filled, their contents are fdatasynced, but we did not fsync the containing directory. An OS/hardware crash in an unfortunate moment could then end up leaving that file with its old name, but new content; WAL replay would thus not replay it. Reported-By: Tomas Vondra Author: Michael Paquier, Tomas Vondra, Andres Freund Discussion: 56583BDD.9060302@2ndquadrant.com Backpatch: All supported branches	2016-03-09 18:53:53 -08:00
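The general pattern behind a durable rename can be sketched as follows: flush the file, rename it, then flush both the renamed file and its containing directory so the new directory entry itself reaches disk. This is a simplified POSIX-only Python illustration under those assumptions; it is not PostgreSQL's durable_rename() implementation, and the helper names are chosen for the example.

```python
import os


def fsync_path(path, flags):
    """Open a path, fsync it, and close it again."""
    fd = os.open(path, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_rename_sketch(old_path, new_path):
    # Make sure the contents are on disk before the name changes.
    fsync_path(old_path, os.O_RDWR)
    os.rename(old_path, new_path)
    # Flush the file under its new name, then the directory that holds the
    # entry; without the directory fsync a crash may leave the old name.
    fsync_path(new_path, os.O_RDWR)
    parent = os.path.dirname(os.path.abspath(new_path))
    fsync_path(parent, os.O_RDONLY)
```

The directory fsync is the step the commit message identifies as previously missing for recycled WAL files.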
Alvaro Herrera	188f359d39	pgcrypto: support changing S2K iteration count pgcrypto already supports key-stretching during symmetric encryption, including the salted-and-iterated method; but the number of iterations was not configurable. This commit implements a new s2k-count parameter to pgp_sym_encrypt() which permits selecting a larger number of iterations. Author: Jeff Janes	2016-03-09 14:31:07 -03:00
Robert Haas	aa09cd242f	postgres_fdw: Consider foreign joining and foreign sorting together. Commit `ccd8f97922` gave us the ability to request that the remote side sort the data, and, later, commit `e4106b2528` gave us the ability to request that the remote side perform the join for us rather than doing it locally. But we could not do both things at the same time: a remote SQL query that had an ORDER BY clause would never be a join. This commit adds that capability. Ashutosh Bapat, reviewed by me.	2016-03-09 10:51:49 -05:00
Andres Freund	7a1d4a2448	ltree: Zero padding bytes when allocating memory for externally visible data. ltree/ltree_gist/ltxtquery's headers store data at MAXALIGN alignment, requiring some padding bytes. So far we left these uninitialized. Zero those by using palloc0. Author: Andres Freund Reported-By: Andres Freund / valgrind / buildfarm animal skink Backpatch: 9.1-	2016-03-08 14:59:29 -08:00
Robert Haas	d29b153f18	Fix reversed argument to bms_is_subset. Ashutosh Bapat	2016-03-08 13:59:11 -05:00
Robert Haas	ba0a198fb1	Add pg_visibility contrib module. This lets you examine the visibility map as well as page-level visibility information. I initially wrote it as a debugging aid, but was encouraged to polish it for commit. Patch by me, reviewed by Masahiko Sawada. Discussion: 56D77803.6080503@BlueTreble.com	2016-03-08 08:42:01 -05:00
Andres Freund	c8f621c43a	logical decoding: Fix handling of large old tuples with replica identity full. When decoding the old version of an UPDATE or DELETE change, and if that tuple was bigger than MaxHeapTupleSize, we either Assert'ed out, or failed in more subtle ways in non-assert builds. Normally individual tuples aren't bigger than MaxHeapTupleSize, with big datums toasted. But that's not the case for the old version of a tuple for logical decoding; the replica identity is logged as one piece. With the default replica identity btree limits that to small tuples, but that's not the case for FULL. Change the tuple buffer infrastructure to separately allocate over-large tuples, instead of always going through the slab cache. This unfortunately requires changing the ReorderBufferTupleBuf definition, we need to store the allocated size someplace. To avoid requiring output plugins to recompile, don't store HeapTupleHeaderData directly after HeapTupleData, but point to it via t_data; that leaves room for the allocated size. As there's no reason for an output plugin to look at ReorderBufferTupleBuf->t_data.header, remove the field. It was just a minor convenience having it directly accessible. Reported-By: Adam Dratwiński Discussion: CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com	2016-03-05 18:02:20 -08:00
Andres Freund	0bda14d54c	logical decoding: old/newtuple in spooled UPDATE changes was switched around. Somehow I managed to flip the order of restoring old & new tuples when de-spooling a change in a large transaction from disk. This happens to only take effect when a change is spooled to disk which has old/new versions of the tuple. That only is the case for UPDATEs where the primary key changed or where replica identity is changed to FULL. The tests didn't catch this because either spooled updates, or updates that changed primary keys, were tested; not both at the same time. Found while adding tests for the following commit. Backpatch: 9.4, where logical decoding was added	2016-03-05 18:02:20 -08:00
Andres Freund	d9e903f3cb	logical decoding: Tell reorderbuffer about all xids. Logical decoding's reorderbuffer keeps transactions in an LSN ordered list for efficiency. To make that efficiently possible, upper-level xids are forced to be logged before nested subtransaction xids. That only works though if these records are all looked at: Unfortunately we didn't do so for e.g. row level locks, which are otherwise uninteresting for logical decoding. This could lead to errors like: "ERROR: subxact logged without previous toplevel record". It's not sufficient to just look at row locking records, the xid could appear first due to a lot of other types of records (which will trigger the transaction to be marked logged with MarkCurrentTransactionIdLoggedIfAny). So invent infrastructure to tell reorderbuffer about xids seen, when they'd otherwise not pass through reorderbuffer.c. Reported-By: Jarred Ward Bug: #13844 Discussion: 20160105033249.1087.66040@wrigleys.postgresql.org Backpatch: 9.4, where logical decoding was added	2016-03-05 18:02:20 -08:00
Robert Haas	3bea3f88d5	postgres_fdw: When sending ORDER BY, always include NULLS FIRST/LAST. Previously, we included NULLS FIRST when appropriate but relied on the default behavior to be NULLS LAST. This is, however, not true for a sort in descending order and seems like a fragile assumption anyway. Report by Rajkumar Raghuwanshi. Patch by Ashutosh Bapat. Review comments from Michael Paquier and Tom Lane.	2016-03-04 11:37:42 -05:00
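The fix described above amounts to never leaving null ordering implicit when building the remote query. A minimal Python sketch of that idea follows; `deparse_order_by` and its tuple format are hypothetical and do not mirror postgres_fdw's deparsing code.

```python
def deparse_order_by(sort_keys):
    """Build an ORDER BY clause from (expression, descending, nulls_first)
    tuples, always spelling out the null ordering."""
    parts = []
    for expr, descending, nulls_first in sort_keys:
        direction = "DESC" if descending else "ASC"
        # Emit NULLS FIRST/LAST unconditionally: the default placement
        # differs between ascending and descending sorts.
        nulls = "NULLS FIRST" if nulls_first else "NULLS LAST"
        parts.append(f"{expr} {direction} {nulls}")
    return "ORDER BY " + ", ".join(parts)
```

Spelling out both cases costs a few bytes per sort key and removes any dependence on the remote server's defaults.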
Andres Freund	1986c3c440	Force synchronous_commit=on in test_decoding's concurrent_ddl_dml.spec. Otherwise running installcheck-force on a server with synchronous_commit=off will result in the tests failing. All the other tests already do so... Backpatch: 9.4, where logical decoding was added	2016-03-03 17:22:25 -08:00
Andres Freund	7c17aac69d	logical decoding: fix decoding of a commit's commit time. When adding replication origins in `5aa235042`, I somehow managed to set the timestamp of decoded transactions to InvalidXLogRecPtr when decoding one made without a replication origin. Fix that, and the wrong type of the new commit_time variable. This didn't trigger a regression test failure because we explicitly don't show commit timestamps in the regression tests, as they obviously are variable. Add a test that checks that a decoded commit's timestamp is within minutes of NOW() from before the commit. Reported-By: Weiping Qu Diagnosed-By: Artur Zakirov Discussion: 56D4197E.9050706@informatik.uni-kl.de, 56D42918.1010108@postgrespro.ru Backpatch: 9.5, where `5aa235042` originates.	2016-03-02 23:42:21 -08:00
Robert Haas	a892234f83	Change the format of the VM fork to add a second bit per page. The new bit indicates whether every tuple on the page is already frozen. It is cleared only when the all-visible bit is cleared, and it can be set only when we vacuum a page and find that every tuple on that page is both visible to every transaction and in no need of any future vacuuming. A future commit will use this new bit to optimize away full-table scans that would otherwise be triggered by XID wraparound considerations. A page which is merely all-visible must still be scanned in that case, but a page which is all-frozen need not be. This commit does not attempt that optimization, although that optimization is the goal here. It seems better to get the basic infrastructure in place first. Per discussion, it's very desirable for pg_upgrade to automatically migrate existing VM forks from the old format to the new format. That, too, will be handled in a follow-on patch. Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit Kapila, Simon Riggs, Andres Freund, and others, and substantially revised by me.	2016-03-01 21:49:41 -05:00
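A two-bits-per-page map with the invariants stated above can be sketched like this. It is a Python toy model, not the visibility map code: the bit values, the class, and its method names are assumptions chosen for the example.

```python
ALL_VISIBLE = 0x01  # assumed bit assignments for the sketch
ALL_FROZEN = 0x02
BITS_PER_PAGE = 2
PAGES_PER_BYTE = 8 // BITS_PER_PAGE


class VisibilityMapSketch:
    def __init__(self, npages):
        nbytes = (npages + PAGES_PER_BYTE - 1) // PAGES_PER_BYTE
        self.bits = bytearray(nbytes)

    def _locate(self, page):
        # Byte index and bit offset of this page's two-bit slot.
        return page // PAGES_PER_BYTE, (page % PAGES_PER_BYTE) * BITS_PER_PAGE

    def get(self, page):
        byte, shift = self._locate(page)
        return (self.bits[byte] >> shift) & 0x03

    def set(self, page, flags):
        # A page can be all-frozen only if it is also all-visible.
        if (flags & ALL_FROZEN) and not ((flags | self.get(page)) & ALL_VISIBLE):
            raise ValueError("all-frozen requires all-visible")
        byte, shift = self._locate(page)
        self.bits[byte] |= flags << shift

    def clear(self, page):
        # Clearing all-visible clears all-frozen too, so both bits go together.
        byte, shift = self._locate(page)
        self.bits[byte] &= ~(0x03 << shift) & 0xFF
```

A wraparound vacuum that consults such a map could skip pages whose slot has the all-frozen bit set, which is the optimization the commit defers to a later patch.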
Andrew Dunstan	87cc6b57a9	Respect TEMP_CONFIG when pg_regress_check and friends are called This reverts commit `9117985b6b` in favor of a more general solution.	2016-02-27 12:28:21 -05:00
Robert Haas	35746bc348	Add new FDW API to test for parallel-safety. This is basically a bug fix; the old code assumes that a ForeignScan is always parallel-safe, but for postgres_fdw, for example, this is definitely false. It should be true for file_fdw, though, since a worker can read a file from the filesystem just as well as any other backend process. Original patch by Thomas Munro. Documentation, and changes to the comments, by me.	2016-02-26 16:14:46 +05:30
Robert Haas	9117985b6b	Respect TEMP_CONFIG when running contrib regression tests. Thomas Munro	2016-02-26 12:38:21 +05:30
Robert Haas	dd077ef832	postgres_fdw: Avoid sharing list substructure. list_concat(list_concat(a, b), c) destructively changes both a and b; to avoid such perils, copy lists of remote_conds before incorporating them into larger lists via list_concat(). Ashutosh Bapat, per a report from Etsuro Fujita	2016-02-21 14:17:50 +05:30
Tom Lane	19a541143a	Add an explicit representation of the output targetlist to Paths. Up to now, there's been an assumption that all Paths for a given relation compute the same output column set (targetlist). However, there are good reasons to remove that assumption. For example, an indexscan on an expression index might be able to return the value of an expensive function "for free". While we have the ability to generate such a plan today in simple cases, we don't have a way to model that it's cheaper than a plan that computes the function from scratch, nor a way to create such a plan in join cases (where the function computation would normally happen at the topmost join node). Also, we need this so that we can have Paths representing post-scan/join steps, where the targetlist may well change from one step to the next. Therefore, invent a "struct PathTarget" representing the columns we expect a plan step to emit. It's convenient to include the output tuple width and tlist evaluation cost in this struct, and there will likely be additional fields in future. While Path nodes that actually do have custom outputs will need their own PathTargets, it will still be true that most Paths for a given relation will compute the same tlist. To reduce the overhead added by this patch, keep a "default PathTarget" in RelOptInfo, and allow Paths that compute that column set to just point to their parent RelOptInfo's reltarget. (In the patch as committed, actually every Path is like that, since we do not yet have any cases of custom PathTargets.) I took this opportunity to provide some more-honest costing of PlaceHolderVar evaluation. Up to now, the assumption that "scan/join reltargetlists have cost zero" was applied not only to Vars, where it's reasonable, but also PlaceHolderVars where it isn't. 
Now, we add the eval cost of a PlaceHolderVar's expression to the first plan level where it can be computed, by including it in the PathTarget cost field and adding that to the cost estimates for Paths. This isn't perfect yet but it's much better than before, and there is a way forward to improve it more. This costing change affects the join order chosen for a couple of the regression tests, changing expected row ordering.	2016-02-18 20:02:03 -05:00
Tom Lane	48e6c943e5	Fix multiple bugs in contrib/pgstattuple's pgstatindex() function. Dead or half-dead index leaf pages were incorrectly reported as live, as a consequence of a code rearrangement I made (during a moment of severe brain fade, evidently) in commit `d287818eb5`. The index metapage was not counted in index_size, causing that result to not agree with the actual index size on-disk. Index root pages were not counted in internal_pages, which is inconsistent compared to the case of a root that's also a leaf (one-page index), where the root would be counted in leaf_pages. Aside from that inconsistency, this could lead to additional transient discrepancies between the reported page counts and index_size, since it's possible for pgstatindex's scan to see zero or multiple pages marked as BTP_ROOT, if the root moves due to a split during the scan. With these fixes, index_size will always be exactly one page more than the sum of the displayed page counts. Also, the index_size result was incorrectly documented as being measured in pages; it's always been measured in bytes. (While fixing that, I couldn't resist doing some small additional wordsmithing on the pgstattuple docs.) Including the metapage causes the reported index_size to not be zero for an empty index. To preserve the desired property that the pgstattuple regression test results are platform-independent (ie, BLCKSZ configuration independent), scale the index_size result in the regression tests. The documentation issue was reported by Otsuka Kenji, and the inconsistent root page counting by Peter Geoghegan; the other problems noted by me. Back-patch to all supported branches, because this has been broken for a long time.	2016-02-18 15:40:35 -05:00
Tom Lane	99a9d6d563	Add missing "static" qualifier. Per buildfarm member pademelon.	2016-02-12 11:20:16 -05:00
Robert Haas	019e788137	postgres_fdw: Remove unnecessary variable. It causes warnings in non-Assert-enabled builds. Per report from Jeff Janes.	2016-02-10 08:17:43 -05:00
Robert Haas	bb4df42e6a	postgres_fdw: Remove unstable regression test. Per Tom Lane and the buildfarm.	2016-02-09 15:42:20 -05:00
Robert Haas	e4106b2528	postgres_fdw: Push down joins to remote servers. If we've got a relatively straightforward join between two tables, this pushes that join down to the remote server instead of fetching the rows for each table and performing the join locally. Some cases are not handled yet, such as SEMI and ANTI joins. Also, we don't yet attempt to create presorted join paths or parameterized join paths even though these options do get tried for a base relation scan. Nevertheless, this seems likely to be a very significant win in many practical cases. Shigeru Hanada and Ashutosh Bapat, reviewed by Robert Haas, with additional review at various points by Tom Lane, Etsuro Fujita, KaiGai Kohei, and Jeevan Chalke.	2016-02-09 14:00:50 -05:00
Tom Lane	63828969c8	Use %u not %d to print OIDs. Oversight in commit `96198d94c`. Etsuro Fujita	2016-02-08 11:06:23 -05:00
Tom Lane	392998bc58	Add missing "static" qualifier. Per buildfarm member pademelon.	2016-02-06 12:21:14 -05:00
Robert Haas	d0cd7bda97	postgres_fdw: pgindent run. In preparation for upcoming commits.	2016-02-04 22:30:08 -05:00
Robert Haas	37c84570b1	postgres_fdw: Avoid possible misbehavior when RETURNING tableoid column only. deparseReturningList ended up adding up RETURNING NULL to the code, but code elsewhere saw an empty list of attributes and concluded that it should not expect tuples from the remote side. Etsuro Fujita and Robert Haas, reviewed by Thom Brown	2016-02-04 22:27:13 -05:00
Robert Haas	c1772ad922	Change the way that LWLocks for extensions are allocated. The previous RequestAddinLWLocks() method had several disadvantages. First, the locks would be in the main tranche; we've recently decided that it's useful for LWLocks used for separate purposes to have separate tranche IDs. Second, there wasn't any correlation between what code called RequestAddinLWLocks() and what code called LWLockAssign(); when multiple modules are in use, it could become quite difficult to troubleshoot problems where LWLockAssign() ran out of locks. To fix, create a concept of named LWLock tranches which can be used either by extension or by core code. Amit Kapila and Robert Haas	2016-02-04 16:43:04 -05:00
Tom Lane	41d2c081ce	Make hstore_to_jsonb_loose match hstore_to_json_loose on what's a number. Commit `e09996ff8d` removed some ad-hoc code in hstore_to_json_loose that determined whether an hstore value string looked like a number, in favor of calling the JSON parser's is-it-a-number code. However, it neglected the fact that the exact same code appeared in hstore_to_jsonb_loose. This is not a bug, exactly, because the requirements on the two functions are not the same: hstore_to_json_loose must accept only syntactically legal JSON numbers as numbers, or it will produce invalid JSON output, as per bug #12070 which spawned the prior commit. But hstore_to_jsonb_loose could accept anything that numeric_in will eat, other than Inf and NaN. Nonetheless it seems surprising and arbitrary that the two functions don't use the same rules for what is a number versus what is a string; especially since they did use the same rules before the aforesaid commit. For one thing, that means that doing hstore_to_json_loose and then casting to jsonb can produce results different from doing just hstore_to_jsonb_loose. Hence, change hstore_to_jsonb_loose's logic to match hstore_to_json_loose, ie, hstore values are treated as numbers when they match the JSON syntax for numbers. No back-patch, since this is more in the nature of a definitional change than a bug fix.	2016-02-03 12:04:02 -05:00
Robert Haas	52b63649fc	Code review for commit `dc203dc3ac`. Remove duplicate assignment. This part by Ashutosh Bapat. Remove now-obsolete comment. This part by me, although the pending join pushdown patch does something similar, and for the same reason: there's no reason to keep two lists of the things in the fdw_private structure that have to be kept in sync with each other.	2016-02-03 11:53:46 -05:00
Robert Haas	dc203dc3ac	postgres_fdw: Allow fetch_size to be set per-table or per-server. The default fetch size of 100 rows might not be right in every environment, so allow users to configure it. Corey Huinker, reviewed by Kyotaro Horiguchi, Andres Freund, and me.	2016-02-03 09:07:35 -05:00
Tom Lane	e6ecc93a17	Fix IsValidJsonNumber() to notice trailing non-alphanumeric garbage. Commit `e09996ff8d` was one brick shy of a load: it didn't insist that the detected JSON number be the whole of the supplied string. This allowed inputs such as "2016-01-01" to be misdetected as valid JSON numbers. Per bug #13906 from Dmitry Ryabov. In passing, be more wary of zero-length input (I'm not sure this can happen given current callers, but better safe than sorry), and do some minor cosmetic cleanup.	2016-02-03 01:39:48 -05:00
Magnus Hagander	e51ab85cd9	Fix typos in comments Author: Michael Paquier	2016-02-01 11:43:48 +01:00
Robert Haas	cc592c48c5	postgres_fdw: More preliminary refactoring for upcoming join pushdown. The code that generates a complete SQL query for a given foreign relation was repeated in two places, and they didn't quite agree: the EXPLAIN case left out the locking clause. Centralize the code so we get the same behavior everywhere, and adjust calling conventions and which functions are static vs. extern accordingly. Ashutosh Bapat, reviewed and slightly adjusted by me.	2016-01-30 10:32:38 -05:00
Tom Lane	7e22470471	Fix incorrect pattern-match processing in psql's \det command. listForeignTables' invocation of processSQLNamePattern did not match up with the other ones that handle potentially-schema-qualified names; it failed to make use of pg_table_is_visible() and also passed the name arguments in the wrong order. Bug seems to have been aboriginal in commit `0d692a0dc9`. It accidentally sort of worked as long as you didn't inquire too closely into the behavior, although the silliness was later exposed by inconsistencies in the test queries added by `59efda3e50` (which I probably should have questioned at the time, but didn't). Per bug #13899 from Reece Hart. Patch by Reece Hart and Tom Lane. Back-patch to all affected branches.	2016-01-29 10:28:02 +01:00
Robert Haas	b88ef201d4	postgres_fdw: Refactor deparsing code for locking clauses. The upcoming patch to allow join pushdown in postgres_fdw needs to use this code multiple times, which requires moving it to deparse.c. That seems like a good idea anyway, so do that now both on general principle and to simplify the future patch. Inspired by a patch by Shigeru Hanada and Ashutosh Bapat, but I did it a little differently than what that patch did.	2016-01-28 16:44:01 -05:00
Robert Haas	2f6b041f76	Add missing quotation mark. This fix accidentally got left out of the previous commit.	2016-01-28 12:21:51 -05:00
Robert Haas	96198d94cb	Avoid multiple foreign server connections when all use same user mapping. Previously, postgres_fdw's connection cache was keyed by user OID and server OID, but this can lead to multiple connections when it's not really necessary. In particular, if all relevant users are mapped to the public user mapping, then their connection options are certainly the same, so one connection can be used for all of them. While we're cleaning things up here, drop the "server" argument to GetConnection(), which isn't really needed. This saves a few cycles because callers no longer have to look this up; the function itself does, but only when establishing a new connection, not when reusing an existing one. Ashutosh Bapat, with a few small changes by me.	2016-01-28 12:05:19 -05:00
Tom Lane	a396144ac0	Remove new coupling between NAMEDATALEN and MAX_LEVENSHTEIN_STRLEN. Commit `e529cd4ffa` introduced an Assert requiring NAMEDATALEN to be less than MAX_LEVENSHTEIN_STRLEN, which has been 255 for a long time. Since up to that instant we had always allowed NAMEDATALEN to be substantially more than that, this was ill-advised. It's debatable whether we need MAX_LEVENSHTEIN_STRLEN at all (versus putting a CHECK_FOR_INTERRUPTS into the loop), or whether it has to be so tight; but this patch takes the narrower approach of just not applying the MAX_LEVENSHTEIN_STRLEN limit to calls from the parser. Trusting the parser for this seems reasonable, first because the strings are limited to NAMEDATALEN which is unlikely to be hugely more than 256, and second because the maximum distance is tightly constrained by MAX_FUZZY_DISTANCE (though we'd forgotten to make use of that limit in one place). That means the cost is not really O(mn) but more like O(max(m,n)). Relaxing the limit for user-supplied calls is left for future research; given the lack of complaints to date, it doesn't seem very high priority. In passing, fix confusion between lengths-in-bytes and lengths-in-chars in comments and error messages. Per gripe from Kevin Day; solution suggested by Robert Haas. Back-patch to 9.5 where the unwanted restriction was introduced.	2016-01-22 11:53:06 -05:00
Tom Lane	dbe2328959	Fix assorted inconsistencies in GIN opclass support function declarations. GIN had some minor issues too, mostly using "internal" where something else would be more appropriate. I went with the same approach as in `9ff60273e3`, namely preferring the opclass' indexed datatype for arguments that receive an operator RHS value, even if that's not necessarily what they really are. Again, this is with an eye to having a uniform rule for ginvalidate() to check support function signatures.	2016-01-19 22:32:22 -05:00
Tom Lane	9ff60273e3	Fix assorted inconsistencies in GiST opclass support function declarations. The conventions specified by the GiST SGML documentation were widely ignored. For example, the strategy-number argument for "consistent" and "distance" functions is specified to be a smallint, but most of the built-in support functions declared it as an integer, and for that matter the core code passed it using Int32GetDatum not Int16GetDatum. None of that makes any real difference at runtime, but it's quite confusing for newcomers to the code, and it makes it very hard to write an amvalidate() function that checks support function signatures. So let's try to instill some consistency here. Another similar issue is that the "query" argument is not of a single well-defined type, but could have different types depending on the strategy (corresponding to search operators with different righthand-side argument types). Some of the functions threw up their hands and declared the query argument as being of "internal" type, which surely isn't right ("any" would have been more appropriate); but the majority position seemed to be to declare it as being of the indexed data type, corresponding to a search operator with both input types the same. So I've specified a convention that that's what to do always. Also, the result of the "union" support function actually must be of the index's storage type, but the documentation suggested declaring it to return "internal", and some of the functions followed that. Standardize on telling the truth, instead. Similarly, standardize on declaring the "same" function's inputs as being of the storage type, not "internal". Also, somebody had forgotten to add the "recheck" argument to both the documentation of the "distance" support function and all of their SQL declarations, even though the C code was happily using that argument. Clean that up too. 
Fix up some other omissions in the docs too, such as documenting that union's second input argument is vestigial. So far as the errors in core function declarations go, we can just fix pg_proc.h and bump catversion. Adjusting the erroneous declarations in contrib modules is more debatable: in principle any change in those scripts should involve an extension version bump, which is a pain. However, since these changes are purely cosmetic and make no functional difference, I think we can get away without doing that.	2016-01-19 12:04:36 -05:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
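The handler-function design described in commit 65c5fcd353 can be pictured with a small, self-contained C sketch. It is not PostgreSQL's actual API: every name below (DemoAmRoutine, DemoAmCatalogRow, demo_btree_handler, demo_lookup_am) is hypothetical and chosen only for illustration. The sketch assumes a catalog that stores just a name and a handler, with all other access method properties coming from a C struct that the handler returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for the struct an index AM handler
 * returns.  Properties that used to be catalog columns become plain
 * fields, and support functions become ordinary C function pointers,
 * so the compiler can check their signatures.
 */
typedef struct DemoAmRoutine
{
    int   nstrategies;                 /* number of operator strategies */
    bool  can_order;                   /* supports ordered scans? */
    bool (*validate)(int opclass_id);  /* AM-side sanity check */
} DemoAmRoutine;

/* A handler takes no arguments and returns the routine struct. */
typedef const DemoAmRoutine *(*DemoAmHandler)(void);

/* Hypothetical two-column catalog row: a name and a handler. */
typedef struct DemoAmCatalogRow
{
    const char   *amname;
    DemoAmHandler amhandler;
} DemoAmCatalogRow;

/* Illustrative check only: accept any positive opclass id. */
static bool
demo_btree_validate(int opclass_id)
{
    return opclass_id > 0;
}

/* The handler fills in and returns the struct for this AM. */
static const DemoAmRoutine *
demo_btree_handler(void)
{
    static const DemoAmRoutine routine = {
        .nstrategies = 5,
        .can_order = true,
        .validate = demo_btree_validate,
    };

    return &routine;
}

static const DemoAmCatalogRow demo_catalog[] = {
    {"demo_btree", demo_btree_handler},
};

/* Look up an AM by name and call its handler; NULL if not found. */
const DemoAmRoutine *
demo_lookup_am(const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; i < sizeof(demo_catalog) / sizeof(demo_catalog[0]); i++)
    {
        if (strcmp(demo_catalog[i].amname, name) == 0)
            return demo_catalog[i].amhandler();
    }
    return NULL;
}
```

A caller would fetch the struct once, for example with demo_lookup_am("demo_btree"), and then read fields or call function pointers directly. The trade-off noted in the commit message shows up here as well: nothing in the two-column catalog row reveals nstrategies or can_order, so checks on those properties have to go through the validate callback.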

3149 Commits