postgresql

Commit Graph

Author	SHA1	Message	Date
Noah Misch	5976097c0f	Prevent stack overflow in query-type functions. The tsquery, ltxtquery and query_int data types have a common ancestor. Having acquired check_stack_depth() calls independently, each was missing at least one call. Back-patch to 9.0 (all supported versions).	2015-10-05 10:06:30 -04:00
Noah Misch	1d812c8b05	pgcrypto: Detect and report too-short crypt() salts. Certain short salts crashed the backend or disclosed a few bytes of backend memory. For existing salt-induced error conditions, emit a message saying as much. Back-patch to 9.0 (all supported versions). Josh Kupershmidt Security: CVE-2015-5288	2015-10-05 10:06:29 -04:00
Tom Lane	8bbe4cbd9b	Improve contrib/pg_stat_statements' handling of garbage collection failure. If we can't read the query texts file (whether because out-of-memory, or for some other reason), give up and reset the file to empty, discarding all stored query texts, though not the statistics per se. We used to leave things alone and hope for better luck next time, but the problem is that the file is only going to get bigger and even harder to slurp into memory. Better to do something that will get us out of trouble. Likewise reset the file to empty for any other failure within gc_qtexts(). The previous behavior after a write error was to discard query texts but not do anything to truncate the file, which is just weird. Also, increase the maximum supported file size from MaxAllocSize to MaxAllocHugeSize; this makes it more likely we'll be able to do a garbage collection successfully. Also, fix recalculation of mean_query_len within entry_dealloc() to match the calculation in gc_qtexts(). The previous coding overlooked the possibility of dropped texts (query_len == -1) and would underestimate the mean of the remaining entries in such cases, thus possibly causing excess garbage collection cycles. In passing, add some errdetail to the log entry that complains about insufficient memory to read the query texts file, which after all was Jim Nasby's original complaint. Back-patch to 9.4 where the current handling of query texts was introduced. Peter Geoghegan, rather editorialized upon by me	2015-10-04 17:58:42 -04:00
Andres Freund	23fc0b485d	Add missed CREATE EXTENSION ... CASCADE regression test adjustment.	2015-10-03 21:31:51 +02:00
Andres Freund	b67aaf21e8	Add CASCADE support for CREATE EXTENSION. Without CASCADE, if an extension has an unfullfilled dependency on another extension, CREATE EXTENSION ERRORs out with "required extension ... is not installed". That is annoying, especially when that dependency is an implementation detail of the extension, rather than something the extension's user can make sense of. In addition to CASCADE this also includes a small set of regression tests around CREATE EXTENSION. Author: Petr Jelinek, editorialized by Michael Paquier, Andres Freund Reviewed-By: Michael Paquier, Andres Freund, Jeff Janes Discussion: 557E0520.3040800@2ndquadrant.com	2015-10-03 18:23:40 +02:00
Andres Freund	920218cbc0	Improve errhint() about replication slot naming restrictions. The existing hint talked about "may only contain letters", but the actual requirement is more strict: only lower case letters are allowed. Reported-By: Rushabh Lathia Author: Rushabh Lathia Discussion: AGPqQf2x50qcwbYOBKzb4x75sO_V3g81ZsA8+Ji9iN5t_khFhQ@mail.gmail.com Backpatch: 9.4-, where replication slots were added	2015-10-03 15:29:08 +02:00
Tom Lane	76f965ff1f	Improve handling of collations in contrib/postgres_fdw. If we have a local Var of say varchar type with default collation, and we apply a RelabelType to convert that to text with default collation, we don't want to consider that as creating an FDW_COLLATE_UNSAFE situation. It should be okay to compare that to a remote Var, so long as the remote Var determines the comparison collation. (When we actually ship such an expression to the remote side, the local Var would become a Param with default collation, meaning the remote Var would in fact control the comparison collation, because non-default implicit collation overrides default implicit collation in parse_collate.c.) To fix, be more precise about what FDW_COLLATE_NONE means: it applies either to a noncollatable data type or to a collatable type with default collation, if that collation can't be traced to a remote Var. (When it can, FDW_COLLATE_SAFE is appropriate.) We were essentially using that interpretation already at the Var/Const/Param level, but we weren't bubbling it up properly. An alternative fix would be to introduce a separate FDW_COLLATE_DEFAULT value to describe the second situation, but that would add more code without changing the actual behavior, so it didn't seem worthwhile. Also, since we're clarifying the rule to be that we care about whether operator/function input collations match, there seems no need to fail immediately upon seeing a Const/Param/non-foreign-Var with nondefault collation. We only have to reject if it appears in a collation-sensitive context (for example, "var IS NOT NULL" is perfectly safe from a collation standpoint, whatever collation the var has). So just set the state to UNSAFE rather than failing immediately. Per report from Jeevan Chalke. This essentially corrects some sloppy thinking in commit `ed3ddf918b`, so back-patch to 9.3 where that logic appeared.	2015-09-24 12:47:29 -04:00
Andres Freund	eef34e5236	test_decoding: Protect against rare spurious test failures. A bunch of tests missed specifying that empty transactions shouldn't be displayed. That causes problems when e.g. autovacuum runs in an unfortunate moment. The tests in question only run for a very short time, making this quite unlikely. Reported-By: Buildfarm member axolotl Backpatch: 9.4, where logical decoding was introduced	2015-09-22 15:39:46 +02:00
Alvaro Herrera	665a00c9e2	Fix error message wording in previous sslinfo commit	2015-09-08 11:10:20 -03:00
Alvaro Herrera	49124613f1	contrib/sslinfo: add ssl_extension_info SRF This new function provides information about SSL extensions present in the X509 certificate used for the current connection. Extension version updated to version 1.1. Author: Дмитрий Воронин (Dmitry Voronin) Reviewed by: Michael Paquier, Heikki Linnakangas, Álvaro Herrera	2015-09-07 21:24:17 -03:00
Alvaro Herrera	d94c36a45a	Add more sanity checks in contrib/sslinfo We were missing a few return checks on OpenSSL calls. Should be pretty harmless, since we haven't seen any user reports about problems, and this is not a high-traffic module anyway; still, a bug is a bug, so backpatch this all the way back to 9.0. Author: Michael Paquier, while reviewing another sslinfo patch	2015-09-07 19:18:29 -03:00
Joe Conway	03543afe15	Adjust sepgsql regression output for recent error context change Recent commit `0426f349e` changed handling of error context reports in such a way to have a minor effect on the sepgsql regression output. Adapt the expected output file to suit. Since that commit was HEAD only, so is this one.	2015-09-06 11:25:36 -07:00
Tom Lane	0426f349ef	Rearrange the handling of error context reports. Remove the code in plpgsql that suppressed the innermost line of CONTEXT for messages emitted by RAISE commands. That was never more than a quick backwards-compatibility hack, and it's pretty silly in cases where the RAISE is nested in several levels of function. What's more, it violated our design theory that verbosity of error reports should be controlled on the client side not the server side. To alleviate the resulting noise increase, introduce a feature in libpq and psql whereby the CONTEXT field of messages can be suppressed, either always or only for non-error messages. Printing CONTEXT for errors only is now their default behavior. The actual code changes here are pretty small, but the effects on the regression test outputs are widespread. I had to edit some of the alternative expected outputs by hand; hopefully the buildfarm will soon find anything I fat-fingered. In passing, fix up (again) the output line counts in psql's various help displays. Add some commentary about how to verify them. Pavel Stehule, reviewed by Petr Jelínek, Jeevan Chalke, and others	2015-09-05 11:58:33 -04:00
Heikki Linnakangas	c80b5f66c6	Fix misc typos. Oskari Saarenmaa. Backpatch to stable branches where applicable.	2015-09-05 11:35:49 +03:00
Teodor Sigaev	1bbd52cb9a	Make unaccent handle all diacritics known to Unicode, and expand ligatures correctly Add Python script for buiding unaccent.rules from Unicode data. Don't backpatch because unaccent changes may require tsvector/index rebuild. Thomas Munro <thomas.munro@enterprisedb.com>	2015-09-04 12:51:53 +03:00
Joe Conway	794e2558be	Fix sepgsql regression tests. The regression tests for sepgsql were broken by changes in the base distro as-shipped policies. Specifically, definition of unconfined_t in the system default policy was changed to bypass multi-category rules, which the regression test depended on. Fix that by defining a custom privileged domain (sepgsql_regtest_superuser_t) and using it instead of system's unconfined_t domain. The new sepgsql_regtest_superuser_t domain performs almost like the current unconfined_t, but restricted by multi-category policy as the traditional unconfined_t was. The custom policy module is a self defined domain, and so should not be affected by related future system policy changes. However, it still uses the unconfined_u:unconfined_r pair for selinux-user and role. Those definitions have not been changed for several years and seem less risky to rely on than the unconfined_t domain. Additionally, if we define custom user/role, they would need to be manually defined at the operating system level, adding more complexity to an already non-standard and complex regression test. Back-patch to 9.3. The regression tests will need more work before working correctly on 9.2. Starting with 9.2, sepgsql has had dependencies on libselinux versions that are only available on newer distros with the changed set of policies (e.g. RHEL 7.x). On 9.1 sepgsql works fine with the older distros with original policy set (e.g. RHEL 6.x), and on which the existing regression tests work fine. We might want eventually change 9.1 sepgsql regression tests to be more independent from the underlying OS policies, however more work will be needed to make that happen and it is not clear that it is worth the effort. Kohei KaiGai with review by Adam Brightwell and me, commentary by Stephen, Alvaro, Tom, Robert, and others.	2015-08-30 11:09:05 -07:00
Peter Eisentraut	6103b3f368	Improve spelling	2015-08-22 21:54:35 -04:00
Andres Freund	e95126cf04	Don't use function definitions looking like old-style ones. This fixes a bunch of somewhat pedantic warnings with new compilers. Since by far the majority of other functions definitions use the (void) style it just seems to be consistent to do so as well in the remaining few places.	2015-08-15 17:25:00 +02:00
Robert Haas	5243a35825	Remove bogus step from test_decoding isolation tests. Commit `43b4a16817` made the isolation tester reject duplicate step names, and it turns out that the test_decoding module's concurrent_ddl_dml isolation test has a duplicate name. I think the second definition isn't actually getting used, so just remove it. Per buildfarm.	2015-08-14 22:40:55 -04:00
Alvaro Herrera	94d626ff5a	Use materialize SRF mode in brin_page_items This function was using the single-value-per-call mechanism, but the code relied on a relcache entry that wasn't kept open across calls. This manifested as weird errors in buildfarm during the short time that the "brin-1" isolation test lived. Backpatch to 9.5, where it was introduced.	2015-08-13 13:02:10 -03:00
Andres Freund	3f811c2d6f	Add confirmed_flush column to pg_replication_slots. There's no reason not to expose both restart_lsn and confirmed_flush since they have rather distinct meanings. The former is the oldest WAL still required and valid for both physical and logical slots, whereas the latter is the location up to which a logical slot's consumer has confirmed receiving data. Most of the time a slot will require older WAL (i.e. restart_lsn) than the confirmed position (i.e. confirmed_flush_lsn). Author: Marko Tiikkaja, editorialized by me Discussion: 559D110B.1020109@joh.to	2015-08-10 13:28:18 +02:00
Tom Lane	fd7ed26c1a	contrib/isn now needs a .gitignore file. Oversight in commit `cb3384a0cb`. Back-patch to 9.1, like that commit.	2015-08-02 23:57:32 -04:00
Tom Lane	09cecdf285	Fix a number of places that produced XX000 errors in the regression tests. It's against project policy to use elog() for user-facing errors, or to omit an errcode() selection for errors that aren't supposed to be "can't happen" cases. Fix all the violations of this policy that result in ERRCODE_INTERNAL_ERROR log entries during the standard regression tests, as errors that can reliably be triggered from SQL surely should be considered user-facing. I also looked through all the files touched by this commit and fixed other nearby problems of the same ilk. I do not claim to have fixed all violations of the policy, just the ones in these files. In a few places I also changed existing ERRCODE choices that didn't seem particularly appropriate; mainly replacing ERRCODE_SYNTAX_ERROR by something more specific. Back-patch to 9.5, but no further; changing ERRCODE assignments in stable branches doesn't seem like a good idea.	2015-08-02 23:49:19 -04:00
Heikki Linnakangas	cb3384a0cb	Fix output of ISBN-13 numbers beginning with 979. An EAN beginning with 979 (but not 9790 - those are ISMN's) are accepted as ISBN numbers, but they cannot be represented in the old, 10-digit ISBN format. They must be output in the new 13-digit ISBN-13 format. We printed out an incorrect value for those. Also add a regression test, to test this and some other basic functionality of the module. Patch by Fabien Coelho. This fixes bug #13442, reported by B.Z. Backpatch to 9.1, where we started to recognize ISBN-13 numbers.	2015-08-02 22:12:33 +03:00
Tom Lane	c879d51c59	Some platforms now need contrib/tsm_system_time to be linked with libm. Buildfarm member hornet, at least, seems to want -lm in the link command. Probably this is due to the just-added use of isnan().	2015-07-25 16:37:12 -04:00
Tom Lane	dd7a8f66ed	Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.	2015-07-25 14:39:00 -04:00
Heikki Linnakangas	c6fbe6d6fb	Add selectivity estimation functions for intarray operators. Uriy Zhuravlev and Alexander Korotkov, reviewed by Jeff Janes, some cleanup by me.	2015-07-21 20:59:24 +03:00
Teodor Sigaev	97f3014647	This supports the triconsistent function for pg_trgm GIN opclass to make it faster to implement indexed queries where some keys are common and some are rare. Patch by Jeff Janes	2015-07-20 18:18:48 +03:00
Andrew Dunstan	00eff86cb8	Enable transforms modules to build and test on Cygwin. This still doesn't work correctly with Python 3, but I am committing this so we can get Cygwin buildfarm members building with Python 2.	2015-07-18 10:09:04 -04:00
Noah Misch	7193436744	AIX: Link TRANSFORM modules with their dependencies. The result closely resembles linking of these modules for the "win32" port. Augment the $(exports_file) header so the file is also usable as an import file. Unfortunately, relocating an AIX installation will now require adding $(pkglibdir) to LD_LIBRARY_PATH. Back-patch to 9.5, where the modules were introduced.	2015-07-15 21:00:26 -04:00
Noah Misch	736c1f238b	MinGW: Link ltree_plpython with plpython. The MSVC build system already did this, and building against Python 3 requires it. Back-patch to 9.5, where the module was introduced.	2015-07-15 21:00:26 -04:00
Fujii Masao	705d397cd9	Prevent pgstattuple() from reporting BRIN as unknown index. Also this patch removes obsolete comment. Back-patch to 9.5 where BRIN index was added.	2015-07-14 22:36:51 +09:00
Noah Misch	0689cfc34b	Link pg_stat_statements with libm. The AIX 7.1 libm is static, and AIX postgres executables do not export symbols acquired from libraries. Back-patch to 9.5, where commit `cfe12763c3` added a sqrt() call.	2015-07-08 20:44:22 -04:00
Tom Lane	10fb48d66d	Add an optional missing_ok argument to SQL function current_setting(). This allows convenient checking for existence of a GUC from SQL, which is particularly useful when dealing with custom variables. David Christensen, reviewed by Jeevan Chalke	2015-07-02 16:41:07 -04:00
Heikki Linnakangas	f92d6a540a	Use appendStringInfoString/Char et al where appropriate. Patch by David Rowley. Backpatch to 9.5, as some of the calls were new in 9.5, and keeping the code in sync with master makes future backpatching easier.	2015-07-02 12:36:03 +03:00
Fujii Masao	fb174687f7	Make use of xlog_internal.h's macros in WAL-related utilities. Commit `179cdd09` added macros to check if a filename is a WAL segment or other such file. However there were still some instances of the strlen + strspn combination to check for that in WAL-related utilities like pg_archivecleanup. Those checks can be replaced with the macros. This patch makes use of the macros in those utilities and which would make the code a bit easier to read. Back-patch to 9.5. Michael Paquier	2015-07-02 10:35:38 +09:00
Andres Freund	d47a1136e4	Fix test_decoding's handling of nonexistant columns in old tuple versions. test_decoding used fastgetattr() to extract column values. That's wrong when decoding updates and deletes if a table's replica identity is set to FULL and new columns have been added since the old version of the tuple was created. Due to the lack of a crosscheck with the datum's natts values an invalid value will be output, leading to errors or worse. Bug: #13470 Reported-By: Krzysztof Kotlarski Discussion: 20150626100333.3874.90852@wrigleys.postgresql.org Backpatch to 9.4, where the feature, including the bug, was added.	2015-06-27 19:00:45 +02:00
Peter Eisentraut	75f9d17638	Make Python tests more portable Newer Python versions randomize the hash seed for dictionaries, resulting in a random output order, which messes up the regression test diffs. Instead, use Python assert to compare the dictionaries with their expected value.	2015-05-31 07:10:45 -04:00
Stephen Frost	cde9cf170c	Finish removing pg_audit	2015-05-28 12:48:25 -04:00
Stephen Frost	e5f1a4f1e3	Remove pg_audit This removes pg_audit, per discussion: 20150528082038.GU26667@tamriel.snowman.net	2015-05-28 12:41:26 -04:00
Tom Lane	2aa0476dc3	Manual cleanup of pgindent results. Fix some places where pgindent did silly stuff, often because project style wasn't followed to begin with. (I've not touched the atomics headers, though.)	2015-05-24 15:04:10 -04:00
Tom Lane	91e79260f6	Remove no-longer-required function declarations. Remove a bunch of "extern Datum foo(PG_FUNCTION_ARGS);" declarations that are no longer needed now that PG_FUNCTION_INFO_V1(foo) provides that. Some of these were evidently missed in commit `e7128e8dbb`, but others were cargo-culted in in code added since then. Possibly that can be blamed in part on the fact that we'd not fixed relevant documentation examples, which I've now done.	2015-05-24 12:20:23 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Heikki Linnakangas	4fc72cc7bb	Collection of typo fixes. Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.	2015-05-20 16:56:22 +03:00
Andres Freund	0740cbd759	Refactor ON CONFLICT index inference parse tree representation. Defer lookup of opfamily and input type of a of a user specified opclass until the optimizer selects among available unique indexes; and store the opclass in the parse analyzed tree instead. The primary reason for doing this is that for rule deparsing it's easier to use the opclass than the previous representation. While at it also rename a variable in the inference code to better fit it's purpose. This is separate from the actual fixes for deparsing to make review easier.	2015-05-19 21:21:27 +02:00
Peter Eisentraut	0779f2ba2d	Fix parse tree of DROP TRANSFORM and COMMENT ON TRANSFORM The plain C string language name needs to be wrapped in makeString() so that the parse tree is copyable. This is detectable by -DCOPY_PARSE_PLAN_TREES. Add a test case for the COMMENT case. Also make the quoting in the error messages more consistent. discovered by Tom Lane	2015-05-18 22:55:14 -04:00
Noah Misch	85270ac7a2	pgcrypto: Report errant decryption as "Wrong key or corrupt data". This has been the predominant outcome. When the output of decrypting with a wrong key coincidentally resembled an OpenPGP packet header, pgcrypto could instead report "Corrupt data", "Not text data" or "Unsupported compression algorithm". The distinct "Corrupt data" message added no value. The latter two error messages misled when the decrypted payload also exhibited fundamental integrity problems. Worse, error message variance in other systems has enabled cryptologic attacks; see RFC 4880 section "14. Security Considerations". Whether these pgcrypto behaviors are likewise exploitable is unknown. In passing, document that pgcrypto does not resist side-channel attacks. Back-patch to 9.0 (all supported versions). Security: CVE-2015-3167	2015-05-18 10:02:31 -04:00
Tom Lane	b14cf229f4	Use += not = to set makefile variables after including base makefiles. The previous coding in hstore_plpython and ltree_plpython wiped out any values set by the base makefiles. This at least had the effect of running the tests in "regression" not "contrib_regression" as expected. These being pretty new modules, there might be other bad effects we'd not noticed yet.	2015-05-17 20:04:42 -04:00
Stephen Frost	a936743b33	pg_audit Makefile, REINDEX changes Clean up the Makefile, per Michael Paquier. Classify REINDEX as we do in core, use '1.0' for the version, per Fujii.	2015-05-17 09:56:57 -04:00
Magnus Hagander	3b075e9d7b	Fix typos in comments Dmitriy Olshevskiy	2015-05-17 14:58:04 +02:00
Peter Eisentraut	fab6ca23ea	hstore_plpython: Fix regression tests under Python 3	2015-05-16 23:35:29 -04:00
Peter Eisentraut	e6dc503445	Fix whitespace	2015-05-16 20:43:32 -04:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Alvaro Herrera	26df7066cc	Move strategy numbers to include/access/stratnum.h For upcoming BRIN opclasses, it's convenient to have strategy numbers defined in a single place. Since there's nothing appropriate, create it. The StrategyNumber typedef now lives there, as well as existing strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from gist.h). skey.h is forced to include stratnum.h because of the StrategyNumber typedef, but gist.h is not; extensions that currently rely on gist.h for rtree strategy numbers might need to add a new A few .c files can stop including skey.h and/or gist.h, which is a nice side benefit. Per discussion: https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org Authored by Emre Hasegeli and Álvaro. (It's not clear to me why bootscanner.l has any #include lines at all.)	2015-05-15 17:03:16 -03:00
Simon Riggs	df259759fb	Add to contrib/Makefile	2015-05-15 15:33:37 -04:00
Simon Riggs	56e121a508	contrib/tsm_system_time	2015-05-15 15:31:50 -04:00
Simon Riggs	4d40494b11	contrib/tsm_system_rows	2015-05-15 15:31:14 -04:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
Stephen Frost	aff27e3379	Remove useless pg_audit.conf No need to have pg_audit.conf any longer since the regression tests are just loading the module at the start of each session (to simulate being in shared_preload_libraries, which isn't something we can actually make happen on the buildfarm itself, it seems). Pointed out by Tom	2015-05-15 10:41:53 -04:00
Simon Riggs	83e176ec18	Separate block sampling functions Refactoring ahead of tablesample patch Requested and reviewed by Michael Paquier Petr Jelinek	2015-05-15 04:02:54 +02:00
Stephen Frost	b22b770683	Make repeated 'make installcheck' runs work In pg_audit, set client_min_messages up to warning, then reset the role attributes, to completely reset the session while not making the regression tests depend on being run by any particular user.	2015-05-14 15:41:39 -04:00
Stephen Frost	ed6ea8e815	Improve pg_audit regression tests Instead of creating a new superuser role, extract out what the current user is and use that user instead. Further, clean up and drop all objects created by the regression test. Pointed out by Tom.	2015-05-14 15:16:27 -04:00
Tom Lane	35a1e1d159	Fix portability issue in pg_audit. "%ld" is not a portable way to print int64's. This may explain the buildfarm crashes we're seeing --- it seems to make dromedary happy, at least.	2015-05-14 13:19:26 -04:00
Tom Lane	6c9e93d3ff	Suppress uninitialized-variable warning.	2015-05-14 12:16:06 -04:00
Stephen Frost	8a2e1edd2b	Further fixes for the buildfarm for pg_audit Also, use a function to load the extension ahead of all other calls, simulating load from shared_libraries_preload, to make sure the hooks are in place before logging start.	2015-05-14 11:55:36 -04:00
Stephen Frost	c703b1e689	Further fixes for the buildfarm for pg_audit The database built by the buildfarm is specific to the extension, use \connect - instead.	2015-05-14 11:44:16 -04:00
Stephen Frost	dfb7624a13	Fix buildfarm with regard to pg_audit Remove the check that pg_audit be installed by shared_preload_libraries as that's not going to work when running the regressions tests in the buildfarm. That check was primairly a nice to have and isn't required anyway.	2015-05-14 10:57:12 -04:00
Stephen Frost	ac52bb0442	Add pg_audit, an auditing extension This extension provides detailed logging classes, ability to control logging at a per-object level, and includes fully-qualified object names for logged statements (DML and DDL) in independent fields of the log output. Authors: Ian Barwick, Abhijit Menon-Sen, David Steele Reviews by: Robert Haas, Tatsuo Ishii, Sawada Masahiko, Fujii Masao, Simon Riggs Discussion with: Josh Berkus, Jaime Casanova, Peter Eisentraut, David Fetter, Yeb Havinga, Alvaro Herrera, Petr Jelinek, Tom Lane, MauMau, Bruce Momjian, Jim Nasby, Michael Paquier, Fabrízio de Royes Mello, Neil Tiffin	2015-05-14 10:36:16 -04:00
Tom Lane	0bb8528b5c	Fix postgres_fdw to return the right ctid value in EvalPlanQual cases. If a postgres_fdw foreign table is a non-locked source relation in an UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, and the query selects its ctid column, the wrong value would be returned if an EvalPlanQual recheck occurred. This happened because the foreign table's result row was copied via the ROW_MARK_COPY code path, and EvalPlanQualFetchRowMarks just unconditionally set the reconstructed tuple's t_self to "invalid". To fix that, we can have EvalPlanQualFetchRowMarks copy the composite datum's t_ctid field, and be sure to initialize that along with t_self when postgres_fdw constructs a tuple to return. If we just did that much then EvalPlanQualFetchRowMarks would start returning "(0,0)" as ctid for all other ROW_MARK_COPY cases, which perhaps does not matter much, but then again maybe it might. The cause of that is that heap_form_tuple, which is the ultimate source of all composite datums, simply leaves t_ctid as zeroes in newly constructed tuples. That seems like a bad idea on general principles: a field that's really not been initialized shouldn't appear to have a valid value. So let's eat the trivial additional overhead of doing "ItemPointerSetInvalid(&(td->t_ctid))" in heap_form_tuple. This closes out our handling of Etsuro Fujita's report that tableoid and ctid weren't correctly set in postgres_fdw EvalPlanQual cases. Along the way we did a great deal of work to improve FDWs' ability to control row locking behavior; which was not wasted effort by any means, but it didn't end up being a fix for this problem because that feature would be too expensive for postgres_fdw to use all the time. Although the fix for the tableoid misbehavior was back-patched, I'm hesitant to do so here; it seems far less likely that people would care about remote ctid than tableoid, and even such a minor behavioral change as this in heap_form_tuple is perhaps best not back-patched. So commit to HEAD only, at least for the moment. Etsuro Fujita, with some adjustments by me	2015-05-13 14:05:29 -04:00
Andres Freund	5850b20f58	Add pgstattuple_approx() to the pgstattuple extension. The new function allows to estimate bloat and other table level statics in a faster, but approximate, way. It does so by using information from the free space map for pages marked as all visible in the visibility map. The rest of the table is actually read and free space/bloat is measured accurately. In many cases that allows to get bloat information much quicker, causing less IO. Author: Abhijit Menon-Sen Reviewed-By: Andres Freund, Amit Kapila and Tomas Vondra Discussion: 20140402214144.GA28681@kea.toroid.org	2015-05-13 07:35:06 +02:00
Peter Eisentraut	d02f16470f	Replace some appendStringInfo* calls with more appropriate variants Author: David Rowley <dgrowleyml@gmail.com>	2015-05-11 20:38:55 -04:00
Tom Lane	1a8a4e5cde	Code review for foreign/custom join pushdown patch. Commit `e7cb7ee145` included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.	2015-05-10 14:36:36 -04:00
Andrew Dunstan	0c90f6769d	Add new OID alias type regrole The new type has the scope of whole the database cluster so it doesn't behave the same as the existing OID alias types which have database scope, concerning object dependency. To avoid confusion constants of the new type are prohibited from appearing where dependencies are made involving it. Also, add a note to the docs about possible MVCC violation and optimization issues, which are general over the all reg* types. Kyotaro Horiguchi	2015-05-09 13:06:49 -04:00
Andres Freund	581f4f9657	Remove dependency on ordering in logical decoding upsert test. Buildfarm member magpie sorted the output differently than intended by Peter. "Resolve" the problem by simply not aggregating, it's not that many lines.	2015-05-08 06:06:03 +02:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Andres Freund	2c8f4836db	Represent columns requiring insert and update privileges indentently. Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund	2015-05-08 00:20:46 +02:00
Alvaro Herrera	db5f98ab4f	Improve BRIN infra, minmax opclass and regression test The minmax opclass was using the wrong support functions when cross-datatypes queries were run. Instead of trying to fix the pg_amproc definitions (which apparently is not possible), use the already correct pg_amop entries instead. This requires jumping through more hoops (read: extra syscache lookups) to obtain the underlying functions to execute, but it is necessary for correctness. Author: Emre Hasegeli, tweaked by Álvaro Review: Andreas Karlsson Also change BrinOpcInfo to record each stored type's typecache entry instead of just the OID. Turns out that the full type cache is necessary in brin_deform_tuple: the original code used the indexed type's byval and typlen properties to extract the stored tuple, which is correct in Minmax; but in other implementations that want to store something different, that's wrong. The realization that this is a bug comes from Emre also, but I did not use his patch. I also adopted Emre's regression test code (with smallish changes), which is more complete.	2015-05-07 13:02:22 -03:00
Tom Lane	b22527f29d	Fix incorrect declaration of citext's regexp_matches() functions. These functions should return SETOF TEXT[], like the core functions they are wrappers for; but they were incorrectly declared as returning just TEXT[]. This mistake had two results: first, if there was no match you got a scalar null result, whereas what you should get is an empty set (zero rows). Second, the 'g' flag was effectively ignored, since you would get only one result array even if there were multiple matches, as reported by Jeff Certain. While ignoring 'g' is a clear bug, the behavior for no matches might well have been thought to be the intended behavior by people who hadn't compared it carefully to the core regexp_matches() functions. So we should tread carefully about introducing this change in the back branches. Still, it clearly is a bug and so providing some fix is desirable. After discussion, the conclusion was to introduce the change in a 1.1 version of the citext extension (as we would need to do anyway); 1.0 still contains the incorrect behavior. 1.1 is the default and only available version in HEAD, but it is optional in the back branches, where 1.0 remains the default version. People wishing to adopt the fix in back branches will need to explicitly do ALTER EXTENSION citext UPDATE TO '1.1'. (I also provided a downgrade script in the back branches, so people could go back to 1.0 if necessary.) This should be called out as an incompatible change in the 9.5 release notes, although we'll also document it in the next set of back-branch release notes. The notes should mention that any views or rules that use citext's regexp_matches() functions will need to be dropped before upgrading to 1.1, and then recreated again afterwards. Back-patch to 9.1. The bug goes all the way back to citext's introduction in 8.4, but pre-9.1 there is no extension mechanism with which to manage the change. Given the lack of previous complaints it seems unnecessary to change this behavior in 9.0, anyway.	2015-05-05 15:51:22 -04:00
Peter Eisentraut	c0574cd5aa	hstore_plpython: Support tests on Python 2.3 Python 2.3 does not have the sorted() function, so do it the long way.	2015-05-04 22:30:21 -04:00
Andrew Dunstan	f802c6ddba	Enable transforms modules to build and run with Mingw builds. These modules were all missing essential Windows scaffolding, including resources files and descriptions, and links to the relevant library import files. This latter item means that the modules can't be built with pgxs on Windows, as we don't install the import files. If we ever decide to install them this restriction could probably be removed. Also, as with plperl we need to make sure that perl's CORE directory is last on the include list, as on Windows it appears to contain some headers with names that clash with names of some headers we include.	2015-05-03 09:10:47 -04:00
Peter Eisentraut	e30a864963	hstore_plperl: Move port-specific parts to later in the makefile PORTNAME isn't set until the global makefiles have been included.	2015-05-02 08:03:47 -04:00
Peter Eisentraut	0fd764647a	Make hstore_plperl's build even more like plperl's Combine the two places that set CPPFLAGS into one. Also, some settings should be restricted to Windows only. More precisely, -Wno-comment is a GCC-only option, but Windows in a makefile implies GCC at the moment. Also, since -Wno-comment is more properly a preprocessor option, move it to CPPFLAGS to simplify things a bit.	2015-05-01 22:16:58 -04:00
Andrew Dunstan	77477e745b	Make hstore_plperl's build more like plperl's This involves moving perl's CORE library to the end of the include list, and adding other compilation settings that plperl uses. This won't completely fix the breakage currently being seen by gcc builds on Windows, but it will let the build get further, and should be wholly benign, if not beneficial, on *nix.	2015-05-01 15:36:44 -04:00
Robert Haas	924bcf4f16	Create an infrastructure for parallel computation in PostgreSQL. This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client. Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.	2015-04-30 15:02:14 -04:00
Peter Eisentraut	dbf2ec1a1c	Fix parallel make risk with new check temp-install setup The "check" target no longer needs to depend on "all", because it now runs "install" directly, which in turn depends on "all". Doing both will cause problems with parallel make, because two builds will run next to each other. Also remove the redirection of the temp-install output into a log file. This was appropriate when this was done from within pg_regress, but now it's just a regular make run, and especially with the above changes this will now take the place of running the "all" target before the test suites. problem report by Jeff Janes, patch in part by Michael Paquier	2015-04-29 20:34:22 -04:00
Andres Freund	5aa2350426	Introduce replication progress tracking infrastructure. When implementing a replication solution ontop of logical decoding, two related problems exist: * How to safely keep track of replication progress * How to change replication behavior, based on the origin of a row; e.g. to avoid loops in bi-directional replication setups The solution to these problems, as implemented here, consist out of three parts: 1) 'replication origins', which identify nodes in a replication setup. 2) 'replication progress tracking', which remembers, for each replication origin, how far replay has progressed in a efficient and crash safe manner. 3) The ability to filter out changes performed on the behest of a replication origin during logical decoding; this allows complex replication topologies. E.g. by filtering all replayed changes out. Most of this could also be implemented in "userspace", e.g. by inserting additional rows contain origin information, but that ends up being much less efficient and more complicated. We don't want to require various replication solutions to reimplement logic for this independently. The infrastructure is intended to be generic enough to be reusable. This infrastructure also replaces the 'nodeid' infrastructure of commit timestamps. It is intended to provide all the former capabilities, except that there's only 2^16 different origins; but now they integrate with logical decoding. Additionally more functionality is accessible via SQL. Since the commit timestamp infrastructure has also been introduced in 9.5 (commit `73c986add`) changing the API is not a problem. For now the number of origins for which the replication progress can be tracked simultaneously is determined by the max_replication_slots GUC. That GUC is not a perfect match to configure this, but there doesn't seem to be sufficient reason to introduce a separate new one. Bumps both catversion and wal page magic. Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer Discussion: 20150216002155.GI15326@awork2.anarazel.de, 20140923182422.GA15776@alap3.anarazel.de, 20131114172632.GE7522@alap2.anarazel.de	2015-04-29 19:30:53 +02:00
Peter Eisentraut	f95425478a	Fix hstore_plperl regression tests on some platforms On some platforms, plperl and plperlu cannot be loaded at the same time. So split the test into two separate test files.	2015-04-26 16:13:58 -04:00
Peter Eisentraut	cac7658205	Add transforms feature This provides a mechanism for specifying conversions between SQL data types and procedural languages. As examples, there are transforms for hstore and ltree for PL/Perl and PL/Python. reviews by Pavel Stěhule and Andres Freund	2015-04-26 10:33:14 -04:00
Tom Lane	f320cbb615	Fix typo in linux startup script. Missed a "$" in what was meant to be a variable substitution. Careless mistake in commit `f23425fa95`.	2015-04-26 09:43:15 -04:00
Peter Eisentraut	dcae5facca	Improve speed of make check-world Before, make check-world would create a new temporary installation for each test suite, which is slow and wasteful. Instead, we now create one test installation that is used by all test suites that are part of a make run. The management of the temporary installation is removed from pg_regress and handled in the makefiles. This allows for better control, and unifies the code with that of test suites not run through pg_regress. review and msvc support by Michael Paquier <michael.paquier@gmail.com> more review by Fabien Coelho <coelho@cri.ensmp.fr>	2015-04-23 08:59:52 -04:00
Stephen Frost	4ccc5bd28e	Pull in tableoid for inheiritance with rowMarks As noted by Etsuro Fujita [1] and Dean Rasheed[2], `cb1ca4d800` changed ExecBuildAuxRowMark() to always look for the tableoid in the target list, but didn't also change preprocess_targetlist() to always include the tableoid. This resulted in errors with soon-to-be-added RLS with inheritance tests, and errors when using inheritance with foreign tables. Authors: Etsuro Fujita and Dean Rasheed (independently) Minor word-smithing on the comments by me. [1] 552CF0B6.8010006@lab.ntt.co.jp [2] CAEZATCVmFUfUOwwhnBTcgi6AquyjQ0-1fyKd0T3xBWJvn+xsFA@mail.gmail.com	2015-04-22 11:29:35 -04:00
Andres Freund	cef939c347	Rename pg_replication_slot's new active_in to active_pid. In `d811c037ce` active_in was added but discussion since showed that active_pid is preferred as a name. Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-22 09:43:40 +02:00
Peter Eisentraut	b0a738f428	Move pg_xlogdump from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-21 19:03:49 -04:00
Andres Freund	d811c037ce	Add 'active_in' column to pg_replication_slots. Right now it is visible whether a replication slot is active in any session, but not in which. Adding the active_in column, containing the pid of the backend having acquired the slot, makes it much easier to associate pg_replication_slots entries with the corresponding pg_stat_replication/pg_stat_activity row. This should have been done from the start, but I (Andres) dropped the ball there somehow. Author: Craig Ringer, revised by me Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-21 11:51:06 +02:00
Peter Eisentraut	528c2e44ab	Move pg_test_timing from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-20 21:30:12 -04:00
Peter Eisentraut	00882d9e5c	Move pg_test_fsync from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-19 22:20:49 -04:00
Peter Eisentraut	9fa8b0ee90	Move pg_upgrade from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-14 19:26:38 -04:00
Peter Eisentraut	30982be4e5	Integrate pg_upgrade_support module into backend Previously, these functions were created in a schema "binary_upgrade", which was deleted after pg_upgrade was finished. Because we don't want to keep that schema around permanently, move them to pg_catalog but rename them with a binary_upgrade_... prefix. The provided functions are only small wrappers around global variables that were added specifically for pg_upgrade use, so keeping the module separate does not create any modularity. The functions still check that they are only called in binary upgrade mode, so it is not possible to call these during normal operation. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-14 19:26:37 -04:00
Heikki Linnakangas	4f700bcd20	Reorganize our CRC source files again. Now that we use CRC-32C in WAL and the control file, the "traditional" and "legacy" CRC-32 variants are not used in any frontend programs anymore. Move the code for those back from src/common to src/backend/utils/hash. Also move the slicing-by-8 implementation (back) to src/port. This is in preparation for next patch that will add another implementation that uses Intel SSE 4.2 instructions to calculate CRC-32C, where available.	2015-04-14 17:03:42 +03:00
Peter Eisentraut	81134af3ec	Move pgbench from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-13 13:07:16 -04:00
Peter Eisentraut	83aca89f7c	Move pg_archivecleanup from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-11 23:29:18 -04:00
Alvaro Herrera	27846f02c1	Optimize locking a tuple already locked by another subxact Locking and updating the same tuple repeatedly led to some strange multixacts being created which had several subtransactions of the same parent transaction holding locks of the same strength. However, once a subxact of the current transaction holds a lock of a given strength, it's not necessary to acquire the same lock again. This made some coding patterns much slower than required. The fix is twofold. First we change HeapTupleSatisfiesUpdate to return HeapTupleBeingUpdated for the case where the current transaction is already a single-xid locker for the given tuple; it used to return HeapTupleMayBeUpdated for that case. The new logic is simpler, and the change to pgrowlocks is a testament to that: previously we needed to check for the single-xid locker separately in a very ugly way. That test is simpler now. As fallout from the HTSU change, some of its callers need to be amended so that tuple-locked-by-own-transaction is taken into account in the BeingUpdated case rather than the MayBeUpdated case. For many of them there is no difference; but heap_delete() and heap_update now check explicitely and do not grab tuple lock in that case. The HTSU change also means that routine MultiXactHasRunningRemoteMembers introduced in commit `11ac4c73cb` is no longer necessary and can be removed; the case that used to require it is now handled naturally as result of the changes to heap_delete and heap_update. The second part of the fix to the performance issue is to adjust heap_lock_tuple to avoid the slowness: 1. Previously we checked for the case that our own transaction already held a strong enough lock and returned MayBeUpdated, but only in the multixact case. Now we do it for the plain Xid case as well, which saves having to LockTuple. 2. If the current transaction is the only locker of the tuple (but with a lock not as strong as what we need; otherwise it would have been caught in the check mentioned above), we can skip sleeping on the multixact, and instead go straight to create an updated multixact with the additional lock strength. 3. Most importantly, make sure that both the single-xid-locker case and the multixact-locker case optimization are applied always. We do this by checking both in a single place, rather than them appearing in two separate portions of the routine -- something that is made possible by the HeapTupleSatisfiesUpdate API change. Previously we would only check for the single-xid case when HTSU returned MayBeUpdated, and only checked for the multixact case when HTSU returned BeingUpdated. This was at odds with what HTSU actually returned in one case: if our own transaction was locker in a multixact, it returned MayBeUpdated, so the optimization never applied. This is what led to the large multixacts in the first place. Per bug report #8470 by Oskari Saarenmaa.	2015-04-10 13:47:15 -03:00
Robert Haas	e41beea0dd	Improve pgbench error reporting. This would have been worth doing on general principle anyway, but the recent addition of an expression syntax to pgbench makes it an even better idea than it would have been otherwise. Fabien Coelho	2015-04-02 16:26:49 -04:00
Andres Freund	62e2a8dc2c	Define integer limits independently from the system definitions. In `83ff1618` we defined integer limits iff they're not provided by the system. That turns out not to be the greatest idea because there's different ways some datatypes can be represented. E.g. on OSX PG's 64bit datatype will be a 'long int', but OSX unconditionally uses 'long long'. That disparity then can lead to warnings, e.g. around printf formats. One way to fix that would be to back int64 using stdint.h's int64_t. While a good idea it's not that easy to implement. We would e.g. need to include stdint.h in our external headers, which we don't today. Also computing the correct int64 printf formats in that case is nontrivial. Instead simply prefix the integer limits with PG_ and define them unconditionally. I've adjusted all the references to them in code, but not the ones in comments; the latter seems unnecessary to me. Discussion: 20150331141423.GK4878@alap3.anarazel.de	2015-04-02 17:43:35 +02:00
Bruce Momjian	a0efc71453	pg_upgrade: call 'postgres' binary to get data directory location This matches the binary 'pg_ctl' calls. Previously we called the 'postmaster'. Report by Christoph Berg	2015-04-01 18:25:45 -04:00
Bruce Momjian	0cf16b44cb	btree_gin: properly call DirectFunctionCall1() Previously we called DirectFunctionCall3() with dummy arguments. Fixed version of previous patch. Report by Jon Nelson	2015-03-31 10:26:45 -04:00
Heikki Linnakangas	1d0db8de04	Remove spurious semicolons. Petr Jelinek	2015-03-31 15:12:27 +03:00
Andrew Dunstan	fa1e5afa8a	Run pg_upgrade and pg_resetxlog with restricted token on Windows As with initdb these programs need to run with a restricted token, and if they don't pg_upgrade will fail when run as a user with Adminstrator privileges. Backpatch to all live branches. On the development branch the code is reorganized so that the restricted token code is now in a single location. On the stable bramches a less invasive change is made by simply copying the relevant code to pg_upgrade.c and pg_resetxlog.c. Patches and bug report from Muhammad Asif Naeem, reviewed by Michael Paquier, slightly edited by me.	2015-03-30 17:07:52 -04:00
Tom Lane	542320c2bd	Be more careful about printing constants in ruleutils.c. The previous coding in get_const_expr() tried to avoid quoting integer, float, and numeric literals if at all possible. While that looks nice, it means that dumped expressions might re-parse to something that's semantically equivalent but not the exact same parsetree; for example a FLOAT8 constant would re-parse as a NUMERIC constant with a cast to FLOAT8. Though the result would be the same after constant-folding, this is problematic in certain contexts. In particular, Jeff Davis pointed out that this could cause unexpected failures in ALTER INHERIT operations because of child tables having not-exactly-equivalent CHECK expressions. Therefore, favor correctness over legibility and dump such constants in quotes except in the limited cases where they'll be interpreted as the same type even without any casting. This results in assorted small changes in the regression test outputs, and will affect display of user-defined views and rules similarly. The odds of that causing problems in the field seem non-negligible; given the lack of previous complaints, it seems best not to change this in the back branches.	2015-03-30 14:59:49 -04:00
Tom Lane	e9dd03c03a	Minor code cleanups in pgbench expression support. Get rid of unnecessary expr_yylex declaration (we haven't supported flex 2.5.4 in a long time, and even if we still did, the declaration in pgbench.h makes this one unnecessary and inappropriate). Fix copyright dates, improve some layout choices, etc.	2015-03-29 13:06:59 -04:00
Tom Lane	2c33e0fbce	Better fix for misuse of Float8GetDatumFast(). We can use that macro as long as we put the value into a local variable. Commit `735cd6128` was not wrong on its own terms, but I think this way looks nicer, and it should save a few cycles on 32-bit machines.	2015-03-28 13:56:37 -04:00
Andrew Dunstan	cfe12763c3	Use standard librart sqrt function in pg_stat_statements The stddev calculation included a faster but unportable sqrt function. This is not worth the extra effort, and won't work everywhere. If the standard library function is good enough for the SQL function it should be good enough here too.	2015-03-28 09:22:51 -04:00
Heikki Linnakangas	e09b48316c	Add index-only scan support to btree_gist. inet, cidr, and timetz indexes still cannot support index-only scans, because they don't store the original unmodified value in the index, but a derived approximate value.	2015-03-27 23:35:16 +02:00
Andrew Dunstan	735cd6128a	Fix portability issues with stddev in pg_stat_statements Stddev is calculated on the fly, and the code in commit `717f709532` was using Float8GetDatumFast() inappropriately to convert the result to a Datum. Mea culpa. It now uses Float8GetDatum().	2015-03-27 17:29:59 -04:00
Andrew Dunstan	717f709532	Add stats for min, max, mean, stddev times to pg_stat_statements. The new fields are min_time, max_time, mean_time and stddev_time. Based on an original patch from Mitsumasa KONDO, modified by me. Reviewed by Petr Jelínek.	2015-03-27 15:43:22 -04:00
Heikki Linnakangas	8816af65e4	Minor refactoring of btree_gist code. The gbt_var_key_copy function was doing two different things depending on the boolean argument. Seems cleaner to have two separate functions. Remove unused argument from gbt_num_compress.	2015-03-26 23:10:10 +02:00
Tom Lane	785941cdc3	Tweak __attribute__-wrapping macros for better pgindent results. This improves on commit `bbfd7edae5` by making two simple changes: * pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn(). Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces pgindent's tendency to misformat declarations involving them. * attributes are now always attached to function declarations, not definitions. Previously some places were taking creative shortcuts, which were not merely candidates for bad misformatting by pgindent but often were outright wrong anyway. (It does little good to put a noreturn annotation where callers can't see it.) In any case, if we would like to believe that these macros can be used with non-gcc compilers, we should avoid gratuitous variance in usage patterns. I also went through and manually improved the formatting of a lot of declarations, and got rid of excessively repetitive (and now obsolete anyway) comments informing the reader what pg_attribute_printf is for.	2015-03-26 14:03:25 -04:00
Andres Freund	83ff1618bc	Centralize definition of integer limits. Several submitted and even committed patches have run into the problem that C89, our baseline, does not provide minimum/maximum values for various integer datatypes. C99's stdint.h does, but we can't rely on it. Several parts of the code defined limits locally, so instead centralize the definitions to c.h. This patch also changes the more obvious usages of literal limit values; there's more places that could be changed, but it's less clear whether it's beneficial to change those. Author: Andrew Gierth Discussion: 87619tc5wc.fsf@news-spur.riddles.org.uk	2015-03-25 22:39:42 +01:00
Bruce Momjian	11226e3817	Revert commit `843cd0bfe6` Report by Tom Lane	2015-03-24 22:35:05 -04:00
Bruce Momjian	843cd0bfe6	btree_gin: properly call DirectFunctionCall1() Previously we called DirectFunctionCall3() with dummy arguments. Patch by Jon Nelson	2015-03-24 20:53:29 -04:00
Tom Lane	cb1ca4d800	Allow foreign tables to participate in inheritance. Foreign tables can now be inheritance children, or parents. Much of the system was already ready for this, but we had to fix a few things of course, mostly in the area of planner and executor handling of row locks. As side effects of this, allow foreign tables to have NOT VALID CHECK constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS. Continuing to disallow these things would've required bizarre and inconsistent special cases in inheritance behavior. Since foreign tables don't enforce CHECK constraints anyway, a NOT VALID one is a complete no-op, but that doesn't mean we shouldn't allow it. And it's possible that some FDWs might have use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops for most. An additional change in support of this is that when a ModifyTable node has multiple target tables, they will all now be explicitly identified in EXPLAIN output, for example: Update on pt1 (cost=0.00..321.05 rows=3541 width=46) Update on pt1 Foreign Update on ft1 Foreign Update on ft2 Update on child3 -> Seq Scan on pt1 (cost=0.00..0.00 rows=1 width=46) -> Foreign Scan on ft1 (cost=100.00..148.03 rows=1170 width=46) -> Foreign Scan on ft2 (cost=100.00..148.03 rows=1170 width=46) -> Seq Scan on child3 (cost=0.00..25.00 rows=1200 width=46) This was done mainly to provide an unambiguous place to attach "Remote SQL" fields, but it is useful for inherited updates even when no foreign tables are involved. Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro Horiguchi, some additional hacking by me	2015-03-22 13:53:21 -04:00
Tom Lane	8d1f239003	Replace insertion sort in contrib/intarray with qsort(). It's all very well to claim that a simplistic sort is fast in easy cases, but O(N^2) in the worst case is not good ... especially if the worst case is as easy to hit as "descending order input". Replace that bit with our standard qsort. Per bug #12866 from Maksym Boguk. Back-patch to all active branches.	2015-03-15 23:22:03 -04:00
Tom Lane	7b8b8a4331	Improve representation of PlanRowMark. This patch fixes two inadequacies of the PlanRowMark representation. First, that the original LockingClauseStrength isn't stored (and cannot be inferred for foreign tables, which always get ROW_MARK_COPY). Since some PlanRowMarks are created out of whole cloth and don't actually have an ancestral RowMarkClause, this requires adding a dummy LCS_NONE value to enum LockingClauseStrength, which is fairly annoying but the alternatives seem worse. This fix allows getting rid of the use of get_parse_rowmark() in FDWs (as per the discussion around commits `462bd95705` and `8ec8760fc8`), and it simplifies some things elsewhere. Second, that the representation assumed that all child tables in an inheritance hierarchy would use the same RowMarkType. That's true today but will soon not be true. We add an "allMarkTypes" field that identifies the union of mark types used in all a parent table's children, and use that where appropriate (currently, only in preprocess_targetlist()). In passing fix a couple of minor infelicities left over from the SKIP LOCKED patch, notably that _outPlanRowMark still thought waitPolicy is a bool. Catversion bump is required because the numeric values of enum LockingClauseStrength can appear in on-disk rules. Extracted from a much larger patch to support foreign table inheritance; it seemed worth breaking this out, since it's a separable concern. Shigeru Hanada and Etsuro Fujita, somewhat modified by me	2015-03-15 18:41:47 -04:00
Robert Haas	e96b7c6b9f	sepgsql: Improve error message when unsupported object type is labeled. KaiGai Kohei, reviewed by Álvaro Herrera and myself	2015-03-11 12:12:10 -04:00
Andres Freund	bbfd7edae5	Add macros wrapping all usage of gcc's __attribute__. Until now __attribute__() was defined to be empty for all compilers but gcc. That's problematic because it prevents using it in other compilers; which is necessary e.g. for atomics portability. It's also just generally dubious to do so in a header as widely included as c.h. Instead add pg_attribute_format_arg, pg_attribute_printf, pg_attribute_noreturn macros which are implemented in the compilers that understand them. Also add pg_attribute_noreturn and pg_attribute_packed, but don't provide fallbacks, since they can affect functionality. This means that external code that, possibly unwittingly, relied on __attribute__ defined to be empty on !gcc compilers may now run into warnings or errors on those compilers. But there shouldn't be many occurances of that and it's hard to work around... Discussion: 54B58BA3.8040302@ohmu.fi Author: Oskari Saarenmaa, with some minor changes by me.	2015-03-11 14:30:01 +01:00
Fujii Masao	57aa5b2bb1	Add GUC to enable compression of full page images stored in WAL. When newly-added GUC parameter, wal_compression, is on, the PostgreSQL server compresses a full page image written to WAL when full_page_writes is on or during a base backup. A compressed page image will be decompressed during WAL replay. Turning this parameter on can reduce the WAL volume without increasing the risk of unrecoverable data corruption, but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay. This commit changes the WAL format (so bumping WAL version number) so that the one-byte flag indicating whether a full page image is compressed or not is included in its header information. This means that the commit increases the WAL volume one-byte per a full page image even if WAL compression is not used at all. We can save that one-byte by borrowing one-bit from the existing field like hole_offset in the header and using it as the flag, for example. But which would reduce the code readability and the extensibility of the feature. Per discussion, it's not worth paying those prices to save only one-byte, so we decided to add the one-byte flag to the header. This commit doesn't introduce any new compression algorithm like lz4. Currently a full page image is compressed using the existing PGLZ algorithm. Per discussion, we decided to use it at least in the first version of the feature because there were no performance reports showing that its compression ratio is unacceptably lower than that of other algorithm. Of course, in the future, it's worth considering the support of other compression algorithm for the better compression. Rahila Syed and Michael Paquier, reviewed in various versions by myself, Andres Freund, Robert Haas, Abhijit Menon-Sen and many others.	2015-03-11 15:52:24 +09:00
Alvaro Herrera	e491bd2ee3	Move BRIN page type to page's last two bytes ... which is the usual convention among AMs, so that pg_filedump and similar utilities can tell apart pages of different AMs. It was also the intent of the original code, but I failed to realize that alignment considerations would move the whole thing to the previous-to-last word in the page. The new definition of the associated macro makes surrounding code a bit leaner, too. Per note from Heikki at http://www.postgresql.org/message-id/546A16EF.9070005@vmware.com	2015-03-10 12:27:15 -03:00
Heikki Linnakangas	f1fd515b39	Move WAL-related definitions from dbcommands.h to separate header file. This makes it easier to write frontend programs that needs to understand the WAL record format of CREATE/DROP DATABASE. dbcommands.h cannot easily be #included in a frontend program, because it pulls in other header files that need backend stuff, but the new dbcommands_xlog.h header file has fewer dependencies.	2015-03-09 15:50:49 +02:00
Alvaro Herrera	c6ee39bc85	Fix contrib/file_fdw's expected file I forgot to update it on yesterday's `cf34e373fc`.	2015-03-06 11:47:09 -03:00
Robert Haas	e5f3690249	pgbench: Fix mistakes in Makefile. My commit `878fdcb843` was not quite right. Tom Lane pointed out one of the mistakes fixed here, and I noticed the other myself while reviewing what I'd committed.	2015-03-03 10:48:16 -05:00
Robert Haas	878fdcb843	pgbench: Add a real expression syntax to \set Previously, you could do \set variable operand1 operator operand2, but nothing more complicated. Now, you can \set variable expression, which makes it much simpler to do multi-step calculations here. This also adds support for the modulo operator (%), with the same semantics as in C. Robert Haas and Fabien Coelho, reviewed by Álvaro Herrera and Stephen Frost	2015-03-02 14:21:41 -05:00
Tom Lane	2e211211a7	Use FLEXIBLE_ARRAY_MEMBER in a number of other places. I think we're about done with this...	2015-02-21 16:12:14 -05:00
Tom Lane	e1a11d9311	Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[]. This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me	2015-02-21 15:13:06 -05:00
Tom Lane	c110eff132	Use FLEXIBLE_ARRAY_MEMBER in struct RecordIOData. I (tgl) fixed this last night in rowtypes.c, but I missed that the code had been copied into a couple of other places. Michael Paquier	2015-02-20 17:03:12 -05:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Kevin Grittner	c923e82a23	Eliminate unnecessary NULL checks in picksplit method of intarray. Where these checks were being done there was no code path which could leave them NULL. Michael Paquier per Coverity	2015-02-16 15:26:23 -06:00
Tom Lane	80986e85aa	Avoid returning undefined bytes in chkpass_in(). We can't really fix the problem that the result is defined to depend on random(), so it is still going to fail the "unstable input conversion" test in parse_type.c. However, we can at least satify valgrind. (It looks like this code used to be valgrind-clean, actually, until somebody did a careless s/strncpy/strlcpy/g on it.) In passing, let's just make real sure that chkpass_out doesn't overrun its output buffer. No need for backpatch, I think, since this is just to satisfy debugging tools. Asif Naeem	2015-02-14 12:20:56 -05:00
Bruce Momjian	dc01efa5cc	pg_upgrade: improve checksum mismatch error message Patch by Greg Sabino Mullane, slight adjustments by me	2015-02-11 22:22:26 -05:00
Bruce Momjian	056764b102	pg_upgrade: quote directory names in delete_old_cluster script This allows the delete script to properly function when special characters appear in directory paths, e.g. spaces. Backpatch through 9.0	2015-02-11 22:06:04 -05:00
Heikki Linnakangas	c619c2351f	Move pg_crc.c to src/common, and remove pg_crc_tables.h To get CRC functionality in a client program, you now need to link with libpgcommon instead of libpgport. The CRC code has nothing to do with portability, so libpgcommon is a better home. (libpgcommon didn't exist when pg_crc.c was originally moved to src/port.) Remove the possibility to get CRC functionality by just #including pg_crc_tables.h. I'm not aware of any extensions that actually did that and couldn't simply link with libpgcommon. This also moves the pg_crc.h header file from src/include/utils to src/include/common, which will require changes to any external programs that currently does #include "utils/pg_crc.h". That seems acceptable, as include/common is clearly the right home for it now, and the change needed to any such programs is trivial.	2015-02-09 11:17:56 +02:00
Robert Haas	370b3a4618	pgcrypto: Code cleanup for decrypt_internal. Remove some unnecessary null-tests, and replace a goto-label construct with an "if" block. Michael Paquier, reviewed by me.	2015-02-04 08:46:32 -05:00
Heikki Linnakangas	4eaafa0453	Remove dead code. Commit `13629df` changed metaphone() function to return an empty string on empty input, but it left the old error message in place. It's now dead code. Michael Paquier, per Coverity warning.	2015-02-03 09:43:44 +02:00
Noah Misch	59b919822a	Prevent Valgrind Memcheck errors around px_acquire_system_randomness(). This function uses uninitialized stack and heap buffers as supplementary entropy sources. Mark them so Memcheck will not complain. Back-patch to 9.4, where Valgrind Memcheck cooperation first appeared. Marko Tiikkaja	2015-02-02 10:00:45 -05:00
Noah Misch	8b59672d8d	Cherry-pick security-relevant fixes from upstream imath library. This covers alterations to buffer sizing and zeroing made between imath 1.3 and imath 1.20. Valgrind Memcheck identified the buffer overruns and reliance on uninitialized data; their exploit potential is unknown. Builds specifying --with-openssl are unaffected, because they use the OpenSSL BIGNUM facility instead of imath. Back-patch to 9.0 (all supported versions). Security: CVE-2015-0243	2015-02-02 10:00:45 -05:00
Noah Misch	1dc7551586	Fix buffer overrun after incomplete read in pullf_read_max(). Most callers pass a stack buffer. The ensuing stack smash can crash the server, and we have not ruled out the viability of attacks that lead to privilege escalation. Back-patch to 9.0 (all supported versions). Marko Tiikkaja Security: CVE-2015-0243	2015-02-02 10:00:45 -05:00
Tom Lane	a59ee88197	Fix Coverity warning about contrib/pgcrypto's mdc_finish(). Coverity points out that mdc_finish returns a pointer to a local buffer (which of course is gone as soon as the function returns), leaving open a risk of misbehaviors possibly as bad as a stack overwrite. In reality, the only possible call site is in process_data_packets() which does not examine the returned pointer at all. So there's no live bug, but nonetheless the code is confusing and risky. Refactor to avoid the issue by letting process_data_packets() call mdc_finish() directly instead of going through the pullf_read() API. Although this is only cosmetic, it seems good to back-patch so that the logic in pgp-decrypt.c stays in sync across all branches. Marko Kreen	2015-01-30 13:05:30 -05:00
Tom Lane	37507962c3	Handle unexpected query results, especially NULLs, safely in connectby(). connectby() didn't adequately check that the constructed SQL query returns what it's expected to; in fact, since commit `08c33c426b` it wasn't checking that at all. This could result in a null-pointer-dereference crash if the constructed query returns only one column instead of the expected two. Less excitingly, it could also result in surprising data conversion failures if the constructed query returned values that were not I/O-conversion-compatible with the types specified by the query calling connectby(). In all branches, insist that the query return at least two columns; this seems like a minimal sanity check that can't break any reasonable use-cases. In HEAD, insist that the constructed query return the types specified by the outer query, including checking for typmod incompatibility, which the code never did even before it got broken. This is to hide the fact that the implementation does a conversion to text and back; someday we might want to improve that. In back branches, leave that alone, since adding a type check in a minor release is more likely to break things than make people happy. Type inconsistencies will continue to work so long as the actual type and declared type are I/O representation compatible, and otherwise will fail the same way they used to. Also, in all branches, be on guard for NULL results from the constructed query, which formerly would cause null-pointer dereference crashes. We now print the row with the NULL but don't recurse down from it. In passing, get rid of the rather pointless idea that build_tuplestore_recursively() should return the same tuplestore that's passed to it. Michael Paquier, adjusted somewhat by me	2015-01-29 20:18:33 -05:00
Andres Freund	ed127002d8	Align buffer descriptors to cache line boundaries. Benchmarks has shown that aligning the buffer descriptor array to cache lines is important for scalability; especially on bigger, multi-socket, machines. Currently the array sometimes already happens to be aligned by happenstance, depending how large previous shared memory allocations were. That can lead to wildly varying performance results after minor configuration changes. In addition to aligning the start of descriptor array, also force the size of individual descriptors to be of a common cache line size (64 bytes). That happens to already be the case on 64bit platforms, but this way we can change the struct BufferDesc more easily. As the alignment primarily matters in highly concurrent workloads which probably all are 64bit these days, and the space wastage of element alignment would be a bit more noticeable on 32bit systems, we don't force the stride to be cacheline sized on 32bit platforms for now. If somebody does actual performance testing, we can reevaluate that decision by changing the definition of BUFFERDESC_PADDED_SIZE. Discussion: 20140202151319.GD32123@awork2.anarazel.de Per discussion with Bruce Momjan, Tom Lane, Robert Haas, and Peter Geoghegan.	2015-01-29 22:48:45 +01:00
Heikki Linnakangas	670bf71f65	Remove dead NULL-pointer checks in GiST code. gist_poly_compress() and gist_circle_compress() checked for a NULL-pointer key argument, but that was dead code; the gist code never passes a NULL-pointer to the "compress" method. This commit also removes a documentation note added in commit `a0a3883`, about doing NULL-pointer checks in the "compress" method. It was added based on the fact that some implementations were doing NULL-pointer checks, but those checks were unnecessary in the first place. The NULL-pointer check in gbt_var_same() function was also unnecessary. The arguments to the "same" method come from the "compress", "union", or "picksplit" methods, but none of them return a NULL pointer. None of this is to be confused with SQL NULL values. Those are dealt with by the gist machinery, and are never passed to the GiST opclass methods. Michael Paquier	2015-01-28 10:03:58 +02:00
Tom Lane	dabda64152	Fix volatile-safety issue in dblink's materializeQueryResult(). Some fields of the sinfo struct are modified within PG_TRY and then referenced within PG_CATCH, so as with recent patch to async.c, "volatile" is necessary for strict POSIX compliance; and that propagates to a couple of subroutines as well as materializeQueryResult() itself. I think the risk of actual issues here is probably higher than in async.c, because storeQueryResult() is likely to get inlined into materializeQueryResult(), leaving the compiler free to conclude that its stores into sinfo fields are dead code.	2015-01-26 15:17:33 -05:00
Tom Lane	586dd5d6a5	Replace a bunch more uses of strncpy() with safer coding. strncpy() has a well-deserved reputation for being unsafe, so make an effort to get rid of nearly all occurrences in HEAD. A large fraction of the remaining uses were passing length less than or equal to the known strlen() of the source, in which case no null-padding can occur and the behavior is equivalent to memcpy(), though doubtless slower and certainly harder to reason about. So just use memcpy() in these cases. In other cases, use either StrNCpy() or strlcpy() as appropriate (depending on whether padding to the full length of the destination buffer seems useful). I left a few strncpy() calls alone in the src/timezone/ code, to keep it in sync with upstream (the IANA tzcode distribution). There are also a few such calls in ecpg that could possibly do with more analysis. AFAICT, none of these changes are more than cosmetic, except for the four occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength source leads to a non-null-terminated destination buffer and ensuing misbehavior. These don't seem like security issues, first because no stack clobber is possible and second because if your values of sslcert etc are coming from untrusted sources then you've got problems way worse than this. Still, it's undesirable to have unpredictable behavior for overlength inputs, so back-patch those four changes to all active branches.	2015-01-24 13:05:42 -05:00
Tom Lane	eb213acfe2	Prevent duplicate escape-string warnings when using pg_stat_statements. contrib/pg_stat_statements will sometimes run the core lexer a second time on submitted statements. Formerly, if you had standard_conforming_strings turned off, this led to sometimes getting two copies of any warnings enabled by escape_string_warning. While this is probably no longer a big deal in the field, it's a pain for regression testing. To fix, change the lexer so it doesn't consult the escape_string_warning GUC variable directly, but looks at a copy in the core_yy_extra_type state struct. Then, pg_stat_statements can change that copy to disable warnings while it's redoing the lexing. It seemed like a good idea to make this happen for all three of the GUCs consulted by the lexer, not just escape_string_warning. There's not an immediate use-case for callers to adjust the other two AFAIK, but making it possible is easy enough and seems like good future-proofing. Arguably this is a bug fix, but there doesn't seem to be enough interest to justify a back-patch. We'd not be able to back-patch exactly as-is anyway, for fear of breaking ABI compatibility of the struct. (We could perhaps back-patch the addition of only escape_string_warning by adding it at the end of the struct, where there's currently alignment padding space.)	2015-01-22 18:11:00 -05:00
Tom Lane	8e166e164c	Rearrange explain.c's API so callers need not embed sizeof(ExplainState). The folly of the previous arrangement was just demonstrated: there's no convenient way to add fields to ExplainState without breaking ABI, even if callers have no need to touch those fields. Since we might well need to do that again someday in back branches, let's change things so that only explain.c has to have sizeof(ExplainState) compiled into it. This costs one extra palloc() per EXPLAIN operation, which is surely pretty negligible.	2015-01-15 13:39:33 -05:00
Robert Haas	0b49642b99	pg_standby: Avoid writing one byte beyond the end of the buffer. Previously, read() might have returned a length equal to the buffer length, and then the subsequent store to buf[len] would write a zero-byte one byte past the end. This doesn't seem likely to be a security issue, but there's some chance it could result in pg_standby misbehaving. Spotted by Coverity; patch by Michael Paquier, reviewed by me.	2015-01-15 09:26:03 -05:00
Robert Haas	4a0a5f21fa	vacuumlo: Avoid unlikely memory leak. Spotted by Coverity. This isn't likely to matter in practice, but there's no harm in fixing it. Michael Paquier	2015-01-14 15:14:20 -05:00
Heikki Linnakangas	e37d474f91	Silence Coverity warnings about unused return values from pushJsonbValue() Similar warnings from backend were silenced earlier by commit `c8315930`, but there were a few more contrib/hstore. Michael Paquier	2015-01-13 14:33:05 +02:00
Bruce Momjian	ac7009abd2	pg_upgrade: fix one-byte per empty db memory leak Report by Tatsuo Ishii, Coverity	2015-01-09 12:12:30 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Andres Freund	8cadeb792c	Correctly handle test durations of more than 2147s in pg_test_timing. Previously the computation of the total test duration, measured in microseconds, accidentally overflowed due to accidentally using signed 32bit arithmetic. As the only consequence is that pg_test_timing invocations with such, overly large, durations never finished the practical consequences of this bug are minor. Pointed out by Coverity. Backpatch to 9.2 where pg_test_timing was added.	2015-01-04 15:44:49 +01:00
Andres Freund	d1c575230d	Fix off-by-one in pg_xlogdump's fuzzy_open_file(). In the unlikely case of stdin (fd 0) being closed, the off-by-one would lead to pg_xlogdump failing to open files. Spotted by Coverity. Backpatch to 9.3 where pg_xlogdump was introduced.	2015-01-04 15:35:46 +01:00
Andres Freund	58bc4747be	Add missing va_end() call to a early exit in dmetaphone.c's StringAt(). Pointed out by Coverity. Backpatch to all supported branches, the code has been that way for a long while.	2015-01-04 15:35:46 +01:00
Tatsuo Ishii	3b5a89c482	Fix resource leak pointed out by Coverity.	2014-12-30 20:33:01 +09:00
Bruce Momjian	83bcc70459	pgbench: remove odd trailing period in init progress output	2014-12-24 09:21:09 -05:00
Heikki Linnakangas	7f0dccaed6	Turn much of the btree_gin macros into real functions. This makes the functions much nicer to read and edit, and also makes debugging easier.	2014-12-22 17:11:53 +02:00
Tom Lane	4a14f13a0a	Improve hash_create's API for selecting simple-binary-key hash functions. Previously, if you wanted anything besides C-string hash keys, you had to specify a custom hashing function to hash_create(). Nearly all such callers were specifying tag_hash or oid_hash; which is tedious, and rather error-prone, since a caller could easily miss the opportunity to optimize by using hash_uint32 when appropriate. Replace this with a design whereby callers using simple binary-data keys just specify HASH_BLOBS and don't need to mess with specific support functions. hash_create() itself will take care of optimizing when the key size is four bytes. This nets out saving a few hundred bytes of code space, and offers a measurable performance improvement in tidbitmap.c (which was not exploiting the opportunity to use hash_uint32 for its 4-byte keys). There might be some wins elsewhere too, I didn't analyze closely. In future we could look into offering a similar optimized hashing function for 8-byte keys. Under this design that could be done in a centralized and machine-independent fashion, whereas getting it right for keys of platform-dependent sizes would've been notationally painful before. For the moment, the old way still works fine, so as not to break source code compatibility for loadable modules. Eventually we might want to remove tag_hash and friends from the exported API altogether, since there's no real need for them to be explicitly referenced from outside dynahash.c. Teodor Sigaev and Tom Lane	2014-12-18 13:36:36 -05:00
Noah Misch	f6dc6dd5ba	Lock down regression testing temporary clusters on Windows. Use SSPI authentication to allow connections exclusively from the OS user that launched the test suite. This closes on Windows the vulnerability that commit `be76a6d39e` closed on other platforms. Users of "make installcheck" or custom test harnesses can run "pg_regress --config-auth=DATADIR" to activate the same authentication configuration that "make check" would use. Back-patch to 9.0 (all supported versions). Security: CVE-2014-0067	2014-12-17 22:48:40 -05:00
Tom Lane	fc2ac1fb41	Allow CHECK constraints to be placed on foreign tables. As with NOT NULL constraints, we consider that such constraints are merely reports of constraints that are being enforced by the remote server (or other underlying storage mechanism). Their only real use is to allow planner optimizations, for example in constraint-exclusion checks. Thus, the code changes here amount to little more than removal of the error that was formerly thrown for applying CHECK to a foreign table. (In passing, do a bit of cleanup of the ALTER FOREIGN TABLE reference page, which had accumulated some weird decisions about ordering etc.) Shigeru Hanada and Etsuro Fujita, reviewed by Kyotaro Horiguchi and Ashutosh Bapat.	2014-12-17 17:00:53 -05:00
Magnus Hagander	cef0ae498c	Update .gitignore for pg_upgrade Add Windows versions of generated scripts, and make sure we only ignore the scripts int he root directory. Michael Paquier	2014-12-17 11:55:22 +01:00
Tom Lane	de8e46f5f5	Suppress bogus statistics when pgbench failed to complete any transactions. Code added in 9.4 would attempt to divide by zero in such cases. Noted while testing fix for missing-pclose problem.	2014-12-16 14:53:55 -05:00
Tom Lane	d38e8d30ce	Fix file descriptor leak after failure of a \setshell command in pgbench. If the called command fails to return data, runShellCommand forgot to pclose() the pipe before returning. This is fairly harmless in the current code, because pgbench would then abandon further processing of that client thread; so no more than nclients descriptors could be leaked this way. But it's not hard to imagine future improvements whereby that wouldn't be true. In any case, it's sloppy coding, so patch all branches. Found by Coverity.	2014-12-16 13:31:42 -05:00
Tom Lane	8ec8760fc8	Revert misguided change to postgres_fdw FOR UPDATE/SHARE code. In commit `462bd95705`, I changed postgres_fdw to rely on get_plan_rowmark() instead of get_parse_rowmark(). I still think that's a good idea in the long run, but as Etsuro Fujita pointed out, it doesn't work today because planner.c forces PlanRowMarks to have markType = ROW_MARK_COPY for all foreign tables. There's no urgent reason to change this in the back branches, so let's just revert that part of yesterday's commit rather than trying to design a better solution under time pressure. Also, add a regression test case showing what postgres_fdw does with FOR UPDATE/SHARE. I'd blithely assumed there was one already, else I'd have realized yesterday that this code didn't work.	2014-12-12 12:41:49 -05:00
Tom Lane	462bd95705	Fix planning of SELECT FOR UPDATE on child table with partial index. Ordinarily we can omit checking of a WHERE condition that matches a partial index's condition, when we are using an indexscan on that partial index. However, in SELECT FOR UPDATE we must include the "redundant" filter condition in the plan so that it gets checked properly in an EvalPlanQual recheck. The planner got this mostly right, but improperly omitted the filter condition if the index in question was on an inheritance child table. In READ COMMITTED mode, this could result in incorrectly returning just-updated rows that no longer satisfy the filter condition. The cause of the error is using get_parse_rowmark() when get_plan_rowmark() is what should be used during planning. In 9.3 and up, also fix the same mistake in contrib/postgres_fdw. It's currently harmless there (for lack of inheritance support) but wrong is wrong, and the incorrect code might get copied to someplace where it's more significant. Report and fix by Kyotaro Horiguchi. Back-patch to all supported branches.	2014-12-11 21:02:25 -05:00
Alvaro Herrera	dcbfc00aba	pg_xlogdump/.gitignore: add committsdesc.c Author: Michael Paquier	2014-12-09 09:54:14 -03:00
Heikki Linnakangas	ebc2b681b8	Fix pg_xlogdump's calculation of full-page image data. The old formula was completely bogus with the new WAL record format.	2014-12-05 11:40:27 +02:00
Peter Eisentraut	1e95bbc870	Fix SHLIB_PREREQS use in contrib, allowing PGXS builds dblink and postgres_fdw use SHLIB_PREREQS = submake-libpq to build libpq first. This doesn't work in a PGXS build, because there is no libpq to build. So just omit setting SHLIB_PREREQS in this case. Note that PGXS users can still use SHLIB_PREREQS (although it is not documented). The problem here is only that contrib modules can be built in-tree or using PGXS, and the prerequisite is only applicable in the former case. Commit `6697aa2bc2` previously attempted to address this by creating a somewhat fake submake-libpq target in Makefile.global. That was not the right fix, and it was also done in a nonportable way, so revert that.	2014-12-04 07:58:12 -05:00
Alvaro Herrera	73c986adde	Keep track of transaction commit timestamps Transactions can now set their commit timestamp directly as they commit, or an external transaction commit timestamp can be fed from an outside system using the new function TransactionTreeSetCommitTsData(). This data is crash-safe, and truncated at Xid freeze point, same as pg_clog. This module is disabled by default because it causes a performance hit, but can be enabled in postgresql.conf requiring only a server restart. A new test in src/test/modules is included. Catalog version bumped due to the new subdirectory within PGDATA and a couple of new SQL functions. Authors: Álvaro Herrera and Petr Jelínek Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut	2014-12-03 11:53:02 -03:00
Andres Freund	0fd38e1370	Don't skip SQL backends in logical decoding for visibility computation. The logical decoding patchset introduced PROC_IN_LOGICAL_DECODING flag PGXACT flag, that allows such backends to be skipped when computing the xmin horizon/snapshots. That's fine and sensible for walsenders streaming out logical changes, but not at all fine for SQL backends doing logical decoding. If the latter set that flag any change they have performed outside of logical decoding will not be regarded as visible - which e.g. can lead to that change being vacuumed away. Note that not setting the flag for SQL backends isn't particularly bothersome - the SQL backend doesn't do streaming, so it only runs for a limited amount of time. Per buildfarm member 'tick' and Alvaro. Backpatch to 9.4, where logical decoding was introduced.	2014-12-02 23:47:08 +01:00
Alvaro Herrera	b52cb4690e	pageinspect/BRIN: minor tweaks Michael Paquier Double-dash additions suggested by Peter Geoghegan	2014-12-02 12:20:50 -03:00
Andrew Dunstan	e09996ff8d	Fix hstore_to_json_loose's detection of valid JSON number values. We expose a function IsValidJsonNumber that internally calls the lexer for json numbers. That allows us to use the same test everywhere, instead of inventing a broken test for hstore conversions. The new function is also used in datum_to_json, replacing the code that is now moved to the new function. Backpatch to 9.3 where hstore_to_json_loose was introduced.	2014-12-01 11:28:45 -05:00
Alvaro Herrera	22dfd116a1	Move test modules from contrib to src/test/modules This is advance preparation for introducing even more test modules; the easy solution is to add them to contrib, but that's bloated enough that it seems a good time to think of something different. Moved modules are dummy_seclabel, test_shm_mq, test_parser and worker_spi. (test_decoding was also a candidate, but there was too much opposition to moving that one. We can always reconsider later.)	2014-11-29 23:55:00 -03:00
Tom Lane	f4e031c662	Add bms_next_member(), and use it where appropriate. This patch adds a way of iterating through the members of a bitmapset nondestructively, unlike the old way with bms_first_member(). While bms_next_member() is very slightly slower than bms_first_member() (at least for typical-size bitmapsets), eliminating the need to palloc and pfree a temporary copy of the target bitmapset is a significant win. So this method should be preferred in all cases where a temporary copy would be necessary. Tom Lane, with suggestions from Dean Rasheed and David Rowley	2014-11-28 13:37:25 -05:00
Tom Lane	c168ba3112	Free libxml2/libxslt resources in a safer order. Mark Simonetti reported that libxslt sometimes crashes for him, and that swapping xslt_process's object-freeing calls around to do them in reverse order of creation seemed to fix it. I've not reproduced the crash, but valgrind clearly shows a reference to already-freed memory, which is consistent with the idea that shutdown of the xsltTransformContext is trying to reference the already-freed stylesheet or input document. With this patch, valgrind is no longer unhappy. I have an inquiry in to see if this is a libxslt bug or if we're just abusing the library; but even if it's a library bug, we'd want to adjust our code so it doesn't fail with unpatched libraries. Back-patch to all supported branches, because we've been doing this in the wrong(?) order for a long time.	2014-11-27 11:13:29 -05:00
Heikki Linnakangas	e453cc2741	Make Port->ssl_in_use available, even when built with !USE_SSL Code that check the flag no longer need #ifdef's, which is more convenient. In particular, makes it easier to write extensions that depend on it. In the passing, modify sslinfo's ssl_is_used function to check ssl_in_use instead of the OpenSSL specific 'ssl' pointer. It doesn't make any difference currently, as sslinfo is only compiled when built with OpenSSL, but seems cleaner anyway.	2014-11-25 09:46:11 +02:00
Robert Haas	f5d9698a84	Add infrastructure to save and restore GUC values. This is further infrastructure for parallelism. Amit Khandekar, Noah Misch, Robert Haas	2014-11-24 16:37:56 -05:00
Tom Lane	9c58101117	Fix mishandling of system columns in FDW queries. postgres_fdw would send query conditions involving system columns to the remote server, even though it makes no effort to ensure that system columns other than CTID match what the remote side thinks. tableoid, in particular, probably won't match and might have some use in queries. Hence, prevent sending conditions that include non-CTID system columns. Also, create_foreignscan_plan neglected to check local restriction conditions while determining whether to set fsSystemCol for a foreign scan plan node. This again would bollix the results for queries that test a foreign table's tableoid. Back-patch the first fix to 9.3 where postgres_fdw was introduced. Back-patch the second to 9.2. The code is probably broken in 9.1 as well, but the patch doesn't apply cleanly there; given the weak state of support for FDWs in 9.1, it doesn't seem worth fixing. Etsuro Fujita, reviewed by Ashutosh Bapat, and somewhat modified by me	2014-11-22 16:01:05 -05:00
Heikki Linnakangas	3a82bc6f8a	Add pageinspect functions for inspecting GIN indexes. Patch by me, Peter Geoghegan and Michael Paquier, reviewed by Amit Kapila.	2014-11-21 11:58:07 +02:00
Heikki Linnakangas	2c03216d83	Revamp the WAL record format. Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.	2014-11-20 18:46:41 +02:00
Robert Haas	a016555361	Avoid file descriptor leak in pg_test_fsync. This can cause problems on Windows, where files that are still open can't be unlinked. Jeff Janes	2014-11-19 12:06:24 -05:00
Alvaro Herrera	f9ef578d05	postgres_fdw.h: don't pull in rel.h when relcache.h is enough	2014-11-14 21:48:53 -03:00
Andres Freund	89fd41b390	Fix and improve cache invalidation logic for logical decoding. There are basically three situations in which logical decoding needs to perform cache invalidation. During/After replaying a transaction with catalog changes, when skipping a uninteresting transaction that performed catalog changes and when erroring out while replaying a transaction. Unfortunately these three cases were all done slightly differently - partially because `8de3e410fa`, which greatly simplifies matters, got committed in the midst of the development of logical decoding. The actually problematic case was when logical decoding skipped transaction commits (and thus processed invalidations). When used via the SQL interface cache invalidation could access the catalog - bad, because we didn't set up enough state to allow that correctly. It'd not be hard to setup sufficient state, but the simpler solution is to always perform cache invalidation outside a valid transaction. Also make the different cache invalidation cases look as similar as possible, to ease code review. This fixes the assertion failure reported by Antonin Houska in 53EE02D9.7040702@gmail.com. The presented testcase has been expanded into a regression test. Backpatch to 9.4, where logical decoding was introduced.	2014-11-13 20:34:31 +01:00
Robert Haas	c0828b78e9	Move the guts of our Levenshtein implementation into core. The hope is that we can use this to produce better diagnostics in some cases. Peter Geoghegan, reviewed by Michael Paquier, with some further changes by me.	2014-11-13 12:33:26 -05:00
Andres Freund	ec5896aed3	Fix several weaknesses in slot and logical replication on-disk serialization. Heikki noticed in 544E23C0.8090605@vmware.com that slot.c and snapbuild.c were missing the FIN_CRC32 call when computing/checking checksums of on disk files. That doesn't lower the the error detection capabilities of the checksum, but is inconsistent with other usages. In a followup mail Heikki also noticed that, contrary to a comment, the 'version' and 'length' struct fields of replication slot's on disk data where not covered by the checksum. That's not likely to lead to actually missed corruption as those fields are cross checked with the expected version and the actual file length. But it's wrong nonetheless. As fixing these issues makes existing on disk files unreadable, bump the expected versions of on disk files for both slots and logical decoding historic catalog snapshots. This means that loading old files will fail with ERROR: "replication slot file ... has unsupported version 1" and ERROR: "snapbuild state file ... has unsupported version 1 instead of 2" respectively. Given the low likelihood of anybody already using these new features in a production setup that seems acceptable. Fixing these issues made me notice that there's no regression test covering the loading of historic snapshot from disk - so add one. Backpatch to 9.4 where these features were introduced.	2014-11-12 18:52:49 +01:00
Andres Freund	bd4ae0f396	Add interrupt checks to contrib/pg_prewarm. Currently the extension's pg_prewarm() function didn't check interrupts once it started "warming" data. Since individual calls can take a long while it's important for them to be interruptible. Backpatch to 9.4 where pg_prewarm was introduced.	2014-11-12 18:52:49 +01:00
Tom Lane	f2ad2bdd0a	Loop when necessary in contrib/pgcrypto's pktreader_pull(). This fixes a scenario in which pgp_sym_decrypt() failed with "Wrong key or corrupt data" on messages whose length is 6 less than a power of 2. Per bug #11905 from Connor Penhale. Fix by Marko Tiikkaja, regression test case from Jeff Janes.	2014-11-11 17:22:15 -05:00
Robert Haas	99e8f08fab	Update pg_xlogdump's .gitignore for brindesc.c.	2014-11-07 15:41:52 -05:00
Alvaro Herrera	7516f52594	BRIN: Block Range Indexes BRIN is a new index access method intended to accelerate scans of very large tables, without the maintenance overhead of btrees or other traditional indexes. They work by maintaining "summary" data about block ranges. Bitmap index scans work by reading each summary tuple and comparing them with the query quals; all pages in the range are returned in a lossy TID bitmap if the quals are consistent with the values in the summary tuple, otherwise not. Normal index scans are not supported because these indexes do not store TIDs. As new tuples are added into the index, the summary information is updated (if the block range in which the tuple is added is already summarized) or not; in the latter case, a subsequent pass of VACUUM or the brin_summarize_new_values() function will create the summary information. For data types with natural 1-D sort orders, the summary info consists of the maximum and the minimum values of each indexed column within each page range. This type of operator class we call "Minmax", and we supply a bunch of them for most data types with B-tree opclasses. Since the BRIN code is generalized, other approaches are possible for things such as arrays, geometric types, ranges, etc; even for things such as enum types we could do something different than minmax with better results. In this commit I only include minmax. Catalog version bumped due to new builtin catalog entries. There's more that could be done here, but this is a good step forwards. Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera, with contribution by Heikki Linnakangas. Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas. Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo. PS: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 318633.	2014-11-07 16:38:14 -03:00
Heikki Linnakangas	2076db2aea	Move the backup-block logic from XLogInsert to a new file, xloginsert.c. xlog.c is huge, this makes it a little bit smaller, which is nice. Functions related to putting together the WAL record are in xloginsert.c, and the lower level stuff for managing WAL buffers and such are in xlog.c. Also move the definition of XLogRecord to a separate header file. This causes churn in the #includes of all the files that write WAL records, and redo routines, but it avoids pulling in xlog.h into most places. Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.	2014-11-06 13:55:36 +02:00
Tom Lane	66c029c842	Fix volatility markings of some contrib I/O functions. In general, datatype I/O functions are supposed to be immutable or at worst stable. Some contrib I/O functions were, through oversight, not marked with any volatility property at all, which made them VOLATILE. Since (most of) these functions actually behave immutably, the erroneous marking isn't terribly harmful; but it can be user-visible in certain circumstances, as per a recent bug report from Joe Van Dyk in which a cast to text was disallowed in an expression index definition. To fix, just adjust the declarations in the extension SQL scripts. If we were being very fussy about this, we'd bump the extension version numbers, but that seems like more trouble (for both developers and users) than the problem is worth. A fly in the ointment is that chkpass_in actually is volatile, because of its use of random() to generate a fresh salt when presented with a not-yet-encrypted password. This is bad because of the general assumption that I/O functions aren't volatile: the consequence is that records or arrays containing chkpass elements may have input behavior a bit different from a bare chkpass column. But there seems no way to fix this without breaking existing usage patterns for chkpass, and the consequences of the inconsistency don't seem bad enough to justify that. So for the moment, just document it in a comment. Since we're not bumping version numbers, there seems no harm in back-patching these fixes; at least future installations will get the functions marked correctly.	2014-11-05 11:34:11 -05:00
Heikki Linnakangas	5028f22f6e	Switch to CRC-32C in WAL and other places. The old algorithm was found to not be the usual CRC-32 algorithm, used by Ethernet et al. We were using a non-reflected lookup table with code meant for a reflected lookup table. That's a strange combination that AFAICS does not correspond to any bit-wise CRC calculation, which makes it difficult to reason about its properties. Although it has worked well in practice, seems safer to use a well-known algorithm. Since we're changing the algorithm anyway, we might as well choose a different polynomial. The Castagnoli polynomial has better error-correcting properties than the traditional CRC-32 polynomial, even if we had implemented it correctly. Another reason for picking that is that some new CPUs have hardware support for calculating CRC-32C, but not CRC-32, let alone our strange variant of it. This patch doesn't add any support for such hardware, but a future patch could now do that. The old algorithm is kept around for tsquery and pg_trgm, which use the values in indexes that need to remain compatible so that pg_upgrade works. While we're at it, share the old lookup table for CRC-32 calculation between hstore, ltree and core. They all use the same table, so might as well.	2014-11-04 11:39:48 +02:00
Tom Lane	f443de873e	Docs: fix incorrect spelling of contrib/pgcrypto option. pgp_sym_encrypt's option is spelled "sess-key", not "enable-session-key". Spotted by Jeff Janes. In passing, improve a comment in pgp-pgsql.c to make it clearer that the debugging options are intentionally undocumented.	2014-11-03 11:11:34 -05:00

... 2 3 4 5 6 ...

3026 Commits