postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-08-07 21:23:29 +02:00

Author	SHA1	Message	Date
Teodor Sigaev	3308467905	Zeroing unused parts during tsquery construction. Per investigation of the failure on buildfarm member skink, with help from RANDOMIZE_ALLOCATED_MEMORY	2016-04-07 20:45:24 +03:00
Tom Lane	f338dd7585	Refactor join_is_removable() to separate out distinctness-proving logic. Extracted from pending unique-join patch, since this is a rather large delta but it's simply moving code out into separately-accessible subroutines. I (tgl) did choose to add a bit more logic to rel_supports_distinctness, so that it verifies that there's at least one potentially usable unique index rather than just checking indexlist != NIL. Otherwise there's no functional change here. David Rowley	2016-04-07 13:12:31 -04:00
Teodor Sigaev	a7ace3b6d9	Make testing of phraseto_tsquery independent of the value of the default_text_search_config variable. Per buildfarm member skink	2016-04-07 19:33:23 +03:00
Kevin Grittner	fcff8a5751	Detect SSI conflicts before reporting constraint violations While prior to this patch the user-visible effect on the database of any set of successfully committed serializable transactions was always consistent with some one-at-a-time order of execution of those transactions, the presence of declarative constraints could allow errors to occur which were not possible in any such ordering, and developers had no good workarounds to prevent user-facing errors where they were not necessary or desired. This patch adds a check for serialization failure ahead of duplicate key checking so that if a developer explicitly (redundantly) checks for the pre-existing value they will get the desired serialization failure where the problem is caused by a concurrent serializable transaction; otherwise they will get a duplicate key error. While it would be better if the reads performed by the constraints could count as part of the work of the transaction for serialization failure checking, and we will hopefully get there some day, this patch allows a clean and reliable way for developers to work around the issue. In many cases existing code will already be doing the right thing for this to "just work". Author: Thomas Munro, with minor editing of docs by me Reviewed-by: Marko Tiikkaja, Kevin Grittner	2016-04-07 11:12:35 -05:00
Teodor Sigaev	bb140506df	Phrase full text search. Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery. On-disk and binary in/out format of tsquery are backward compatible. It has two side effects: - it changes the sort order for tsquery, so users who have a btree index over tsquery should reindex it - fewer parentheses appear in tsquery output, so tsquery becomes more readable Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov Reviewers: Alexander Korotkov, Artur Zakirov	2016-04-07 18:44:18 +03:00
Simon Riggs	015e88942a	Load FK defs into relcache for use by planner Fastpath ignores this if no triggers defined. Author: Tomas Vondra, with fastpath and comments added by me Reviewers: David Rowley, Simon Riggs	2016-04-07 12:08:33 +01:00
Noah Misch	f2b1b3079c	Standardize GetTokenInformation() error reporting. Commit `c22650cd64` sparked a discussion about diverse interpretations of "token user" in error messages. Expel old and new specimens of that phrase by making all GetTokenInformation() callers report errors the way GetTokenUser() has been reporting them. These error conditions almost can't happen, so users are unlikely to observe this change. Reviewed by Tom Lane and Stephen Frost.	2016-04-06 23:41:43 -04:00
Noah Misch	33d3fc5e2a	Remove redundant message in AddUserToTokenDacl(). GetTokenUser() will have reported an adequate error message. These error conditions almost can't happen, so users are unlikely to observe this change. Reviewed by Tom Lane and Stephen Frost.	2016-04-06 23:40:51 -04:00
Stephen Frost	29dd1504a1	Bump catversion for pg_dump dump catalog ACL patches Pointed out by Tom.	2016-04-06 23:04:48 -04:00
Stephen Frost	1574783b4c	Use GRANT system to manage access to sensitive functions Now that pg_dump will properly dump out any ACL changes made to functions which exist in pg_catalog, switch to using the GRANT system to manage access to those functions. This means removing 'if (!superuser()) ereport()' checks from the functions themselves and then REVOKEing EXECUTE right from 'public' for these functions in system_views.sql. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Stephen Frost	23f34fa4ba	In pg_dump, include pg_catalog and extension ACLs, if changed Now that all of the infrastructure exists, add in the ability to dump out the ACLs of the objects inside of pg_catalog or the ACLs for objects which are members of extensions, but only if they have been changed from their original values. The original values are tracked in pg_init_privs. When pg_dump'ing 9.6-and-above databases, we will dump out the ACLs for all objects in pg_catalog and the ACLs for all extension members, where the ACL has been changed from the original value which was set during either initdb or CREATE EXTENSION. This should not change dumps against pre-9.6 databases. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Stephen Frost	d217b2c360	In pg_dump, split "dump" into "dump" and "dump_contains" Historically, the "dump" component of the namespace has been used to decide if the objects inside of the namespace should be dumped also. Given that "dump" is now a bitmask and may be partial, and we may want to dump out all components of the namespace object but only some of the components of objects contained in the namespace, create a "dump_contains" bitmask which will represent what components of the objects inside of a namespace should be dumped out. No behavior change here, but in preparation for a change where we will dump out just the ACLs of objects in pg_catalog, but we might not dump out the ACL of the pg_catalog namespace itself (for instance, when it hasn't been changed from the value set at initdb time). Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Stephen Frost	a9f0e8e5a2	In pg_dump, use a bitmap to represent what to include pg_dump has historically used a simple boolean 'dump' value to indicate if a given object should be included in the dump or not. Instead, use a bitmap which breaks down the components of an object into their distinct pieces and use that bitmap to only include the components requested. This does not include any behavioral change, but is in preparation for the change to dump out just ACLs for objects in pg_catalog. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Stephen Frost	6c268df127	Add new catalog called pg_init_privs This new catalog holds the privileges which the system was initialized with at initdb time, along with any permissions set by extensions at CREATE EXTENSION time. This allows pg_dump (and any other similar use-cases) to detect when the privileges set on initdb-created or extension-created objects have been changed from what they were set to at initdb/extension-creation time and handle those changes appropriately. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Teodor Sigaev	0b62fd036e	Add jsonb_insert It inserts a new value into a jsonb array at an arbitrary position or a new key into a jsonb object. Author: Dmitry Dolgov Reviewers: Petr Jelinek, Vitaly Burovoy, Andrew Dunstan	2016-04-06 19:25:00 +03:00
Peter Eisentraut	3b3fcc4eea	pg_dump: Add table qualifications to some tags Some object types have names that are only unique for one table. But for those we generally didn't put the table name into the dump TOC tag. So it was impossible to identify these objects if the same name was used for multiple tables. This affects policies, column defaults, constraints, triggers, and rules. Fix by adding the table name to the TOC tag, so that it now reads "$schema $table $object". Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2016-04-06 12:13:11 -04:00
Tom Lane	de94e2af18	Run pgindent on a batch of (mostly-planner-related) source files. Getting annoyed at the amount of unrelated chatter I get from pgindent'ing Rowley's unique-joins patch. Re-indent all the files it touches.	2016-04-06 11:34:02 -04:00
Fujii Masao	ead9963c47	Use proper format specifier %X/%X for LSN, again. Commit `cee31f5` fixed this problem, but commit `989be08` accidentally reverted the fix. Thomas Munro	2016-04-06 22:20:52 +09:00
Simon Riggs	cac0e36682	Revert `bf08f2292f` Remove recent changes to logging XLOG_RUNNING_XACTS by request.	2016-04-06 14:03:46 +01:00
Simon Riggs	3fe3511d05	Generic Messages for Logical Decoding API and mechanism to allow generic messages to be inserted into WAL that are intended to be read by logical decoding plugins. This commit adds an optional new callback to the logical decoding API. Messages are either text or bytea. Messages can be transactional, or not, and are identified by a prefix to allow multiple concurrent decoding plugins. (Not to be confused with Generic WAL records, which are intended to allow crash recovery of extensible objects.) Author: Petr Jelinek and Andres Freund Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs Discussion: 5685F999.6010202@2ndquadrant.com	2016-04-06 10:05:41 +01:00
Fujii Masao	989be0810d	Support multiple synchronous standby servers. Previously synchronous replication offered only the ability to confirm that all changes made by a transaction had been transferred to at most one synchronous standby server. This commit extends synchronous replication so that it supports multiple synchronous standby servers. It enables users to consider one or more standby servers as synchronous, and increase the level of transaction durability by ensuring that transaction commits wait for replies from all of those synchronous standbys. Multiple synchronous standby servers are configured in synchronous_standby_names which is extended to support new syntax of 'num_sync ( standby_name [ , ... ] )', where num_sync specifies the number of synchronous standbys that transaction commits need to wait for replies from and standby_name is the name of a standby server. The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before is also still supported. It's the same as new syntax with num_sync=1. This commit doesn't include "quorum commit" feature which was discussed in pgsql-hackers. Synchronous standbys are chosen based on their priorities. synchronous_standby_names determines the priority of each standby for being chosen as a synchronous standby. The standbys whose names appear earlier in the list are given higher priority and will be considered as synchronous. Other standby servers appearing later in this list represent potential synchronous standbys. The regression test for multiple synchronous standbys is not included in this commit. It should come later. Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs, Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen, Rajeev Rastogi Many thanks to the various individuals who were involved in discussing and developing this feature.	2016-04-06 17:18:25 +09:00
Alvaro Herrera	f2fcad27d5	Support ALTER THING .. DEPENDS ON EXTENSION This introduces a new dependency type which marks an object as depending on an extension, such that if the extension is dropped, the object automatically goes away; and also, if the database is dumped, the object is included in the dump output. Currently the grammar supports this for indexes, triggers, materialized views and functions only, although the utility code is generic so adding support for more object types is a matter of touching the parser rules only. Author: Abhijit Menon-Sen Reviewed-by: Alexander Korotkov, Álvaro Herrera Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org	2016-04-05 18:38:54 -03:00
Robert Haas	41ea0c2376	Fix parallel-safety code for parallel aggregation. has_parallel_hazard() was ignoring the proparallel markings for aggregates, which is no good. Fix that. There was no way to mark an aggregate as actually being parallel-safe, either, so add a PARALLEL option to CREATE AGGREGATE. Patch by me, reviewed by David Rowley.	2016-04-05 16:06:15 -04:00
Robert Haas	09adc9a8c0	Align all shared memory allocations to cache line boundaries. Experimentation shows this only costs about 6kB, which seems well worth it given the major performance effects that can be caused by insufficient alignment, especially on larger systems. Discussion: 14166.1458924422@sss.pgh.pa.us	2016-04-05 15:47:49 -04:00
Tom Lane	1d2fe56e42	Fix PL/Python for recursion and interleaved set-returning functions. PL/Python failed if a PL/Python function was invoked recursively via SPI, since arguments are passed to the function in its global dictionary (a horrible decision that's far too ancient to undo) and it would delete those dictionary entries on function exit, leaving the outer recursion level(s) without any arguments. Not deleting them would be little better, since the outer levels would then see the innermost level's arguments. Since PL/Python uses ValuePerCall mode for evaluating set-returning functions, it's possible for multiple executions of the same SRF to be interleaved within a query. PL/Python failed in such a case, because it stored only one iterator per function, directly in the function's PLyProcedure struct. Moreover, one interleaved instance of the SRF would see argument values that should belong to another. Hence, invent code for saving and restoring the argument entries. To fix the recursion case, we only need to save at recursive entry and restore at recursive exit, so the overhead in non-recursive cases is negligible. To fix the SRF case, we have to save when suspending a SRF and restore when resuming it, which is potentially not negligible; but fortunately this is mostly a matter of manipulating Python object refcounts and should not involve much physical data copying. Also, store the Python iterator and saved argument values in a structure associated with the SRF call site rather than the function itself. This requires adding a memory context deletion callback to ensure that the SRF state is cleaned up if the calling query exits before running the SRF to completion. Without that we'd leak a refcount to the iterator object in such a case, resulting in session-lifespan memory leakage. (In the pre-existing code, there was no memory leak because there was only one iterator pointer, but what would happen is that the previous iterator would be resumed by the next query attempting to use the SRF. Hardly the semantics we want.) We can buy back some of whatever overhead we've added by getting rid of PLy_function_delete_args(), which seems a useless activity: there is no need to delete argument entries from the global dictionary on exit, since the next time anyone would see the global dict is on the next fresh call of the PL/Python function, at which time we'd overwrite those entries with new arg values anyway. Also clean up some really ugly coding in the SRF implementation, including such gems as returning directly out of a PG_TRY block. (The only reason that failed to crash hard was that all existing call sites immediately exited their own PG_TRY blocks, popping the dangling longjmp pointer before there was any chance of it being used.) In principle this is a bug fix; but it seems a bit too invasive relative to its value for a back-patch, and besides the fix depends on memory context callbacks so it could not go back further than 9.5 anyway. Alexey Grishchenko and Tom Lane	2016-04-05 14:51:19 -04:00
Robert Haas	11c8669c0c	Add parallel query support functions for assorted aggregates. This lets us use parallel aggregate for a variety of useful cases that didn't work before, like sum(int8), sum(numeric), several versions of avg(), and various other functions. Add some regression tests, as well, testing the general sanity of these and future catalog entries. David Rowley, reviewed by Tomas Vondra, with a few further changes by me.	2016-04-05 14:32:53 -04:00
Magnus Hagander	7117685461	Implement backup API functions for non-exclusive backups Previously non-exclusive backups had to be done using the replication protocol and pg_basebackup. With this commit it's now possible to make them using pg_start_backup/pg_stop_backup as well, as long as the backup program can maintain a persistent connection to the database. Doing this, backup_label and tablespace_map are returned as results from pg_stop_backup() instead of being written to the data directory. This makes the server safe from a crash during an ongoing backup, which can be a problem with exclusive backups. The old syntax of the functions remains and works exactly as before, but since the new syntax is safer this should eventually be deprecated and removed. Only reference documentation is included. The main section on backup still needs to be rewritten to cover this, but since that is already scheduled for a separate large rewrite, it's not included in this patch. Reviewed by David Steele and Amit Kapila	2016-04-05 20:03:49 +02:00
Magnus Hagander	9457b591b9	Fix typo Etsuro Fujita	2016-04-05 11:05:01 +02:00
Peter Eisentraut	4dcd4da98c	Fix error message from wal_level value renaming found by Ian Barwick	2016-04-04 21:17:54 -04:00
Tom Lane	99f3b5613b	Disallow newlines in parameter values to be set in ALTER SYSTEM. As noted by Julian Schauder in bug #14063, the configuration-file parser doesn't support embedded newlines in string literals. While there might someday be a good reason to remove that restriction, there doesn't seem to be one right now. However, ALTER SYSTEM SET could accept strings containing newlines, since many of the variable-specific value-checking routines would just see a newline as whitespace. This led to writing a postgresql.auto.conf file that was broken and had to be removed manually. Pending a reason to work harder, just throw an error if someone tries this. In passing, fix several places in the ALTER SYSTEM logic that failed to provide an errcode() for an ereport(), and thus would falsely log the failure as an internal XX000 error. Back-patch to 9.4 where ALTER SYSTEM was introduced.	2016-04-04 18:05:23 -04:00
Alvaro Herrera	890614d2b3	Display WAL pointer in rm_redo error callback This makes it easier to identify the source of a recovery problem in case of a bug or data corruption.	2016-04-04 18:12:12 -03:00
Tom Lane	3c69b33f45	Add a few comments about ANALYZE's strategy for collecting MCVs. Alex Shulgin complained that the underlying strategy wasn't all that apparent, particularly not the fact that we intentionally have two code paths depending on whether we think the column has a limited set of possible values or not. Try to make it clearer.	2016-04-04 17:06:33 -04:00
Tom Lane	391159e03a	Partially revert commit `3d3bf62f30`. On reflection, the pre-existing logic in ANALYZE is specifically meant to compare the frequency of a candidate MCV against the estimated frequency of a random distinct value across the whole table. The change to compare it against the average frequency of values actually seen in the sample doesn't seem very principled, and if anything it would make us less likely not more likely to consider a value an MCV. So revert that, but keep the aspect of considering only nonnull values, which definitely is correct. In passing, rename the local variables in these stanzas to "ndistinct_table", to avoid confusion with the "ndistinct" that appears at an outer scope in compute_scalar_stats.	2016-04-04 16:48:13 -04:00
Alvaro Herrera	c9ff752a85	Silence compiler warning Reported by Peter Eisentraut to occur on 32bit systems	2016-04-04 17:07:23 -03:00
Tom Lane	2bbe9112ae	Add a \gexec command to psql for evaluation of computed queries. \gexec executes the just-entered query, like \g, but instead of printing the results it takes each field as a SQL command to send to the server. Computing a series of queries to be executed is a fairly common thing, but up to now you always had to resort to kluges like writing the queries to a file and then inputting the file. Now it can be done with no intermediate step. The implementation is fairly straightforward except for its interaction with FETCH_COUNT. ExecQueryUsingCursor isn't capable of being called recursively, and even if it were, its need to create a transaction block interferes unpleasantly with the desired behavior of \gexec after a failure of a generated query (i.e., that it can continue). Therefore, disable use of ExecQueryUsingCursor when doing the master \gexec query. We can still apply it to individual generated queries, however, and there might be some value in doing so. While testing this feature's interaction with single-step mode, I (tgl) was led to conclude that SendQuery needs to recognize SIGINT (cancel_pressed) as a negative response to the single-step prompt. Perhaps that's a back-patchable bug fix, but for now I just included it here. Corey Huinker, reviewed by Jim Nasby, Daniel Vérité, and myself	2016-04-04 15:25:16 -04:00
Tom Lane	66229ac004	Introduce a LOG_SERVER_ONLY ereport level, which is never sent to client. This elevel is useful for logging audit messages and similar information that should not be passed to the client. It's equivalent to LOG in terms of decisions about logging priority in the postmaster log, but messages with this elevel will never be sent to the client. In the current implementation, it's just an alias for the longstanding COMMERROR elevel (or more accurately, we've made COMMERROR an alias for this). At some point it might be interesting to allow a LOG_ONLY flag to be attached to any elevel, but that would be considerably more complicated, and it's not clear there's enough use-cases to justify the extra work. For now, let's just take the easy 90% solution. David Steele, reviewed by Fabien Coelho, Petr Jelínek, and myself	2016-04-04 12:32:42 -04:00
Tom Lane	58666ed28a	Fix latent portability issue in pgwin32_dispatch_queued_signals(). The first iteration of the signal-checking loop would compute sigmask(0) which expands to 1<<(-1) which is undefined behavior according to the C standard. The lack of field reports of trouble suggest that it evaluates to 0 on all existing Windows compilers, but that's hardly something to rely on. Since signal 0 isn't a queueable signal anyway, we can just make the loop iterate from 1 instead, and save a few cycles as well as avoiding the undefined behavior. In passing, avoid evaluating the volatile expression UNBLOCKED_SIGNAL_QUEUE twice in a row; there's no reason to waste cycles like that. Noted by Aleksander Alekseev, though this isn't his proposed fix. Back-patch to all supported branches.	2016-04-04 11:13:17 -04:00
Dean Rasheed	84f9a35e39	Improve estimate of distinct values in estimate_num_groups(). When adjusting the estimate for the number of distinct values from a rel in a grouped query to take into account the selectivity of the rel's restrictions, use a formula that is less likely to produce under-estimates. The old formula simply multiplied the number of distinct values in the rel by the restriction selectivity, which would be correct if the restrictions were fully correlated with the grouping expressions, but can produce significant under-estimates in cases where they are not well correlated. The new formula is based on the random selection probability, and so assumes that the restrictions are not correlated with the grouping expressions. This is guaranteed to produce larger estimates, and of course risks over-estimating in cases where the restrictions are correlated, but that has less severe consequences than under-estimating, which might lead to a HashAgg that consumes an excessive amount of memory. This could possibly be improved upon in the future by identifying correlated restrictions and using a hybrid of the old and new formulae. Author: Tomas Vondra, with some hacking by me Reviewed-by: Mark Dilger, Alexander Korotkov, Dean Rasheed and Tom Lane Discussion: http://www.postgresql.org/message-id/flat/56CD0381.5060502@2ndquadrant.com	2016-04-04 12:41:56 +01:00
Simon Riggs	bf08f2292f	Avoid archiving XLOG_RUNNING_XACTS on idle server If archive_timeout > 0 we should avoid logging XLOG_RUNNING_XACTS if idle. Bug 13685 reported by Laurence Rowe, investigated in detail by Michael Paquier, though this is not his proposed fix. 20151016203031.3019.72930@wrigleys.postgresql.org Simple non-invasive patch to allow later backpatch to 9.4 and 9.5	2016-04-04 07:18:05 +01:00
Simon Riggs	3e4b7d8798	Avoid pin scan for replay of XLOG_BTREE_VACUUM in all cases Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds or in some cases minutes while the XLOG_BTREE_VACUUM was replayed. This commit skips the pin scan that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here and WAL is identical. The current commit applies in all cases, effectively replacing commit `687f2cd7a0`.	2016-04-03 17:46:09 +01:00
Tom Lane	3cc38ca7d2	Add psql \errverbose command to see last server error at full verbosity. Often, upon getting an unexpected error in psql, one's first wish is that the verbosity setting had been higher; for example, to be able to see the schema-name field or the server code location info. Up to now the only way has been to adjust the VERBOSITY variable and repeat the failing query. That's a pain, and it doesn't work if the error isn't reproducible. This commit adds a psql feature that redisplays the most recent server error at full verbosity, without needing to make any variable changes or re-execute the failed command. We just need to hang onto the latest error PGresult in case the user executes \errverbose, and then apply libpq's new PQresultVerboseErrorMessage() function to it. This will consume some trivial amount of psql memory, but otherwise the cost when the feature isn't used should be negligible. Alex Shulgin, reviewed by Daniel Vérité, some improvements by me	2016-04-03 12:29:55 -04:00
Tom Lane	e3161b231c	Add libpq support for recreating an error message with different verbosity. Often, upon getting an unexpected error in psql, one's first wish is that the verbosity setting had been higher; for example, to be able to see the schema-name field or the server code location info. Up to now the only way has been to adjust the VERBOSITY variable and repeat the failing query. That's a pain, and it doesn't work if the error isn't reproducible. This commit adds support in libpq for regenerating the error message for an existing error PGresult at any desired verbosity level. This is almost just a matter of refactoring the existing code into a subroutine, but there is one bit of possibly-needed information that was not getting put into PGresults: the text of the last query sent to the server. We must add that string to the contents of an error PGresult. But we only need to save it if it might be used, which with the existing error-formatting code only happens if there is a PG_DIAG_STATEMENT_POSITION error field, which is probably pretty rare for errors in production situations. So really the overhead when the feature isn't used should be negligible. Alex Shulgin, reviewed by Daniel Vérité, some improvements by me	2016-04-03 12:24:54 -04:00
Tom Lane	a1953f3a60	Make all the declarations of WaitEventSetWaitBlock be marked "inline". The inconsistency here triggered compiler warnings on some buildfarm members, and it's surely pretty pointless.	2016-04-02 13:55:44 -04:00
Tom Lane	45aae8e789	Suppress compiler warning. Some buildfarm members are showing "comparison is always false due to limited range of data type" complaints on this test, so #ifdef it out on machines with 32-bit int.	2016-04-02 13:49:17 -04:00
Stephen Frost	62b5cd234b	Fix typo in pg_regress.c s/afer/after Pointed out by Andreas 'ads' Scherbaum	2016-04-02 11:12:17 -04:00
Noah Misch	c22650cd64	Refer to a TOKEN_USER payload as a "token user," not as a "user token". This corrects messages for can't-happen errors. The corresponding "user token" appears in the HANDLE argument of GetTokenInformation().	2016-04-01 21:53:18 -04:00
Noah Misch	4ad6f13500	Copyedit comments and documentation.	2016-04-01 21:53:10 -04:00
Alvaro Herrera	f07d18b6e9	test_slot_timelines: Fix alternate expected output	2016-04-01 18:36:07 -03:00
Tom Lane	3d3bf62f30	Omit null rows when setting the threshold for what's a most-common value. As with the previous patch, large numbers of null rows could skew this calculation unfavorably, causing us to discard values that have a legitimate claim to be MCVs, since our definition of MCV is that it's most common among the non-null population of the column. Hence, make the numerator of avgcount be the number of non-null sample values not the number of sample rows; likewise for maxmincount in the compute_scalar_stats variant. Also, make the denominator be the number of distinct values actually observed in the sample, rather than reversing it back out of the computed stadistinct. This avoids depending on the accuracy of the Haas-Stokes approximation, and really it's what we want anyway; the threshold should depend only on what we see in the sample, not on what we extrapolate about the contents of the whole column. Alex Shulgin, reviewed by Tomas Vondra and myself	2016-04-01 17:03:27 -04:00
Alvaro Herrera	5cb882675a	pgbench: Remove unused parameter For some reason this parameter was introduced as unused in `3da0dfb4b1`, and has never been used for anything. Remove it. Author: Fabien Coelho	2016-04-01 17:11:18 -03:00
Tom Lane	be4b4dc759	Omit null rows when applying the Haas-Stokes estimator for ndistinct. Previously, we included null rows in the values of n and N that went into the formula, which amounts to considering null as a value in its own right; but the d and f1 values do not include nulls. This is inconsistent, and it contributes to significant underestimation of ndistinct when the column is mostly nulls. In any case stadistinct is defined as the number of distinct non-null values, so we should exclude nulls when doing this computation. This is an aboriginal bug in our application of the Haas-Stokes formula, but we'll refrain from back-patching for fear of destabilizing plan choices in released branches. While at it, make the code a bit more readable by omitting unnecessary casts and intermediate variables. Observation and original patch by Tomas Vondra, adjusted to fix both uses of the formula by Alex Shulgin, cosmetic improvements by me	2016-04-01 15:48:24 -04:00
Alvaro Herrera	82c83b3372	Fix logical_decoding_timelines test crashes In the test_slot_timelines test module, we were abusing passing NULL values which was received as zeroes in x86, but this breaks in ARM (buildfarm member hamster) by crashing instead. Fix the breakage by marking these functions as STRICT; the InvalidXid value that was previously implicit in NULL values (on x86 at least) can now be passed as 0. Failing to follow the fmgr protocol to check for NULLs beforehand was causing ARM to fail, as evidenced by segmentation faults in buildfarm member hamster. In order to use the new functionality in the test script, use COALESCE in the right spot to avoid forwarding NULL values. This was diagnosed from the hamster crash by Craig Ringer, who also proposed a different patch (checking for NULL values explicitly in the C function code, and keeping the non-strictness in the C functions). I decided to go with this approach instead.	2016-04-01 16:47:00 -03:00
Alvaro Herrera	f402b99501	Type names should not be quoted Our actual convention, contrary to what I said in `59a2111b23`, is not to quote type names, as evidenced by unquoted use of format_type_be() result value in error messages. Remove quotes from recently tweaked messages accordingly. Per note from Tom Lane	2016-04-01 13:35:48 -03:00
Tom Lane	a067b50470	Get rid of minus zero in box regression test. Commit `acdf2a8b` added a test case involving minus zero as a box endpoint. This is not very portable, as evidenced by the several older buildfarm members that are failing on the test because they print minus zero as just "0". If there were any significant reason to test this behavior, we could consider carrying a separate expected-file; but it doesn't look to me like there's adequate justification to accept such a maintenance burden. Just change the test to use plain zero, instead.	2016-04-01 12:25:17 -04:00
Tom Lane	2306696004	Fix oversight in getParamDescriptions(), and improve comments. When getParamDescriptions was changed to handle out-of-memory better by cribbing error recovery logic from getRowDescriptions/getAnotherTuple, somebody omitted to copy the stanza about checking for excess data in the message. But you need to do that, since continue'ing out of the switch in pqParseInput3 means no such check gets applied there anymore. Noted while looking at Michael Paquier's patch that made yet another copy of this advance_and_error logic. (This whole business desperately needs refactoring, because I sure don't want to see a dozen copies of this code, but that's where we seem to be headed. What's more, the "suspend parsing on EOF return" convention is a holdover from protocol 2 and shouldn't exist at all in protocol 3, because we don't process partial messages anymore. But for now, just fix the obvious bug.) Also, fix some wrong/missing comments about what the API spec is for these three functions. This doesn't seem worthy of back-patching, even though it's a bug; the case shouldn't ever arise in the field.	2016-04-01 12:14:16 -04:00
Teodor Sigaev	65578341af	Add Generic WAL interface This interface is designed to give extensions access to WAL, for example so that they can implement a new access method. Previously this was impossible because restoring from custom WAL would need to access the system catalog to find a custom redo function. This patch provides a generic way to describe changes to a page with the standard layout. Bump XLOG_PAGE_MAGIC because of the new record type. Author: Alexander Korotkov with help from Petr Jelinek and Markus Nullmeier, and minor editing by me Reviewers: Petr Jelinek, Alvaro Herrera, Teodor Sigaev, Jim Nasby, Michael Paquier	2016-04-01 12:21:48 +03:00
Tom Lane	c202ecf902	Another zic portability fix. I should have remembered that we can't use INT64_MODIFIER with sscanf(): configure chooses that to work with snprintf(), but it might be for our src/port/snprintf.c implementation and so not compatible with the platform's sscanf(). This appears to be the explanation for buildfarm member frogmouth's continuing unhappiness with the tzcode update. Fortunately, in all of the places where zic is attempting to read into an int64 variable, it's reading a year which certainly will fit just fine into an int. So make it read into an int with %d, and then cast or copy as necessary.	2016-03-31 16:14:55 -04:00
Alvaro Herrera	61608d3836	Fix recovery_min_apply_delay test Previously this test was relying too much on WAL replay to occur in the exact configured interval, which was unreliable on slow or overly busy servers. Use a custom loop instead of poll_query_until, which is hopefully more reliable. Per continued failures on buildfarm member hamster (which is probably the only one running this test suite) Author: Michaël Paquier	2016-03-31 16:06:32 -03:00
Tom Lane	f9aefcb91f	Support using index-only scans with partial indexes in more cases. Previously, the planner would reject an index-only scan if any restriction clause for its table used a column not available from the index, even if that restriction clause would later be dropped from the plan entirely because it's implied by the index's predicate. This is a fairly common situation for partial indexes because predicates using columns not included in the index are often the most useful kind of predicate, and we have to duplicate (or at least imply) the predicate in the WHERE clause in order to get the index to be considered at all. So index-only scans were essentially unavailable with such partial indexes. To fix, we have to do detection of implied-by-predicate clauses much earlier in the planner. This patch puts it in check_index_predicates (nee check_partial_indexes), meaning it gets done for every partial index, whereas we previously only considered this issue at createplan time, so that the work was only done for an index actually selected for use. That could result in a noticeable planning slowdown for queries against tables with many partial indexes. However, testing suggested that there isn't really a significant cost, especially not with reasonable numbers of partial indexes. We do get a small additional benefit, which is that cost_index is more accurate since it correctly discounts the evaluation cost of clauses that will be removed. We can also avoid considering such clauses as potential indexquals, which saves useless matching cycles in the case where the predicate columns aren't in the index, and prevents generating bogus plans that double-count the clause's selectivity when the columns are in the index. Tomas Vondra and Kyotaro Horiguchi, reviewed by Kevin Grittner and Konstantin Knizhnik, and whacked around a little by me	2016-03-31 14:49:10 -04:00
Alvaro Herrera	3501f71c21	Fix broken variable declaration Author: Konstantin Knizhnik	2016-03-30 23:39:15 -03:00
Alvaro Herrera	3dd0792ae0	Blind attempt at fixing Win32 issue on `24c5f1a103` As best as I can tell, MyReplicationSlot needs to be PGDLLIMPORT in order for the new test_slot_timelines test module to compile. Per buildfarm	2016-03-30 23:12:20 -03:00
Fujii Masao	cee31f5fee	Use proper format specifier %X/%X for LSN.	2016-03-31 11:03:40 +09:00
Alvaro Herrera	3a3b309041	I forgot the alternate expected file in previous commit Without this, the test_slot_timelines modules fails "make installcheck" because the required feature is not enabled in a stock server. Per buildfarm	2016-03-30 20:48:24 -03:00
Alvaro Herrera	24c5f1a103	Enable logical slots to follow timeline switches When decoding from a logical slot, it's necessary for xlog reading to be able to read xlog from historical (i.e. not current) timelines; otherwise, decoding fails after failover, because the archives are in the historical timeline. This is required to make "failover logical slots" possible; it currently has no other use, although theoretically it could be used by an extension that creates a slot on a standby and continues to replay from the slot when the standby is promoted. This commit includes a module in src/test/modules with functions to manipulate the slots (which is not otherwise possible in SQL code) in order to enable testing, and a new test in src/test/recovery to ensure that the behavior is as expected. Author: Craig Ringer Reviewed-By: Oleksii Kliukin, Andres Freund, Petr Jelínek	2016-03-30 20:07:05 -03:00
Alvaro Herrera	3b02ea4f07	XLogReader general code cleanup Some minor tweaks and comment additions, for cleanliness sake and to avoid having the upcoming timeline-following patch be polluted with unrelated cleanup. Extracted from a larger patch by Craig Ringer, reviewed by Andres Freund, with some additions by myself.	2016-03-30 18:56:13 -03:00
Tom Lane	50861cd683	Improve portability of I/O behavior for the geometric types. Formerly, the geometric I/O routines such as box_in and point_out relied directly on strtod() and sprintf() for conversion of the float8 component values of their data types. However, the behavior of those functions is pretty platform-dependent, especially for edge-case values such as infinities and NaNs. This was exposed by commit `acdf2a8b37`, which added test cases involving boxes with infinity endpoints, and immediately failed on Windows and AIX buildfarm members. We solved these problems years ago in the main float8in and float8out functions, so let's fix it by making the geometric types use that code instead of depending directly on the platform-supplied functions. To do this, refactor the float8in code so that it can be used to parse just part of a string, and as a convenience make the guts of float8out usable without going through DirectFunctionCall. While at it, get rid of geo_ops.c's fairly shaky assumptions about the maximum output string length for a double, by having it build results in StringInfo buffers instead of fixed-length strings. In passing, convert all the "invalid input syntax for type foo" messages in this area of the code into "invalid input syntax for type %s" to reduce the number of distinct translatable strings, per recent discussion. We would have needed a fair number of the latter anyway for code-sharing reasons, so we might as well just go whole hog. Note: this patch is by no means intended to guarantee that the geometric types uniformly behave sanely for infinity or NaN component values. But any bugs we have in that line were there all along, they were just harder to reach in a platform-independent way.	2016-03-30 17:25:03 -04:00
Tom Lane	818e593736	Suppress uninitialized-variable warnings. My compiler doesn't like the lack of initialization of "flag", and I think it's right: if there were zero keys we'd have an undefined result. The AND of zero items is TRUE, so initialize to TRUE.	2016-03-30 13:36:18 -04:00
Teodor Sigaev	2d02a856e8	Bump catalog version, forgotten in `acdf2a8b37`	2016-03-30 18:56:21 +03:00
Teodor Sigaev	acdf2a8b37	Introduce SP-GiST operator class over box. The patch implements a quad-tree over boxes. The naive approach of a 2D quad tree does not work for non-point objects because splitting space at a node is not efficient. The idea of the patch is to treat 2D boxes as 4D points, so that objects do not overlap (in 4D space). Performance tests reveal that this technique is especially beneficial with heavily overlapping objects, so-called "spaghetti data". Author: Alexander Lebedev with editing by Emre Hasegeli and me	2016-03-30 18:42:36 +03:00
Teodor Sigaev	87545f5412	Use traversalValue in SP-GiST range opclass. Author: Alexander Lebedev	2016-03-30 18:38:53 +03:00
Teodor Sigaev	ccd6eb49a4	Introduce traversalValue for SP-GiST scan During a scan it is sometimes very helpful to know some information about the parent node or all ancestor nodes. Right now reconstructedValue could be used, but that is not a proper use of it (the range opclass does that). traversalValue is an arbitrary piece of memory in a separate MemoryContext, while reconstructedValue should have the same type as the indexed column. Subsequent patches for the range opclass and quad4d tree will use it. Author: Alexander Lebedev, Teodor Sigaev	2016-03-30 18:29:28 +03:00
Magnus Hagander	3063e7a840	Add missing gss option to msvc config template Michael Paquier	2016-03-30 10:49:44 +02:00
Tom Lane	c53ab8a3af	Remove just-added tests for to_timestamp(float8) with out-of-range inputs. Reporting the specific out-of-range input value produces platform-dependent results. We could skip reporting the value, but that's contrary to our message style guidelines and unhelpful to users. Or we could add a separate expected-output file for Windows, but that would be a substantial maintenance burden, and these test cases seem unlikely to be worth it. Per buildfarm.	2016-03-29 22:23:32 -04:00
Robert Haas	314cbfc5da	Add new replication mode synchronous_commit = 'remote_apply'. In this mode, the master waits for the transaction to be applied on the remote side, not just written to disk. That means that you can count on a transaction started on the standby to see all commits previously acknowledged by the master. To make this work, the standby sends a reply after replaying each commit record generated with synchronous_commit >= 'remote_apply'. This introduces a small inefficiency: the extra replies will be sent even by standbys that aren't the current synchronous standby. But previously-existing synchronous_commit levels make no attempt at all to optimize which replies are sent based on what the primary cares about, so this is no worse, and at least avoids any extra replies for people not using the feature at all. Thomas Munro, reviewed by Michael Paquier and by me. Some additional tweaks by me.	2016-03-29 21:29:49 -04:00
Tom Lane	a898b409f6	Fix interval_mul() to not produce insane results. interval_mul() attempts to prevent its calculations from producing silly results, but it forgot that zero times infinity yields NaN in IEEE arithmetic. Hence, a case like '1 second'::interval * 'infinity'::float8 produced a NaN for the months product, which didn't trigger the range check, resulting in bogus and possibly platform-dependent output. This isn't terribly obvious to the naked eye because if you try that exact case, you get "interval out of range" which is what you expect --- but if you look closer, the error is coming from interval_out not interval_mul. interval_mul has allowed a bogus value into the system. Fix by adding isnan tests. Noted while testing Vitaly Burovoy's fix for infinity input to to_timestamp(). Given the lack of field complaints, I doubt this is worth a back-patch.	2016-03-29 17:21:12 -04:00
Tom Lane	e511d878f3	Allow to_timestamp(float8) to convert float infinity to timestamp infinity. With the original SQL-function implementation, such cases failed because we don't support infinite intervals. Converting the function to C lets us bypass the interval representation, which should be a bit faster as well as more flexible. Vitaly Burovoy, reviewed by Anastasia Lubennikova	2016-03-29 17:09:29 -04:00
Robert Haas	96f8373cad	Fix bug in aggregate (de)serialization commit. resulttypeLen and resulttypeByVal must be set correctly when serializing aggregates, not just when finalizing them. This was in David's final patch but I downloaded the wrong version by mistake and failed to spot the error. David Rowley	2016-03-29 15:21:57 -04:00
Robert Haas	5fe5a2cee9	Allow aggregate transition states to be serialized and deserialized. This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me.	2016-03-29 15:04:05 -04:00
Alvaro Herrera	a1c935d3b7	pgbench: allow a script weight of zero This refines the previous weight range and allows a script to be "turned off" by passing a zero weight, which is useful when scripting multiple pgbench runs. I did not apply the suggested warning when a script uses zero weight; we use the principle elsewhere that if there's nothing to be done, do nothing quietly. Adjust docs accordingly. Author: Jeff Janes, Fabien Coelho	2016-03-29 14:47:10 -03:00
Robert Haas	ad9566470b	pgbench: Remove \setrandom. You can now do the same thing via \set using the appropriate function, either random(), random_gaussian(), or random_exponential(), depending on the desired distribution. This is not backward-compatible, but per discussion, it's worth it to avoid having the old syntax hang around forever. Fabien Coelho, reviewed by Michael Paquier, and adjusted by me.	2016-03-29 12:08:49 -04:00
Tom Lane	7abc157165	Avoid possibly-unsafe use of Windows' FormatMessage() function. Whenever this function is used with the FORMAT_MESSAGE_FROM_SYSTEM flag, it's good practice to include FORMAT_MESSAGE_IGNORE_INSERTS as well. Otherwise, if the message contains any %n insertion markers, the function will try to fetch argument strings to substitute --- which we are not passing, possibly leading to a crash. This is exactly analogous to the rule about not giving printf() a format string you're not in control of. Noted and patched by Christian Ullrich. Back-patch to all supported branches.	2016-03-29 11:55:19 -04:00
Teodor Sigaev	61d66c44f1	Fix support of digits in email/hostnames. When tsearch was implemented I made several mistakes in the hostname/email definition rules: 1) allowed underscore in hostname, which is prohibited by the RFC 2) forgot to allow leading digits separated by hyphen (like 123-x.com) in hostname 3) did not allow underscore/hyphen after leading digits in the localpart of an email Artur's patch resolves the last two issues, but as a side effect allows host names like 123_x.com together with 123-x.com. The RFC forbids underscore usage in hostnames, but pg has allowed it since the initial tsearch version in core, although only for non-digits. The patch syncs support for digits and non-digits in both hostname and email. Forbidding underscore in hostnames may break existing usage of tsearch and, anyhow, it should be done by a separate patch. Author: Artur Zakirov BUG: #13964	2016-03-29 18:28:49 +03:00
Robert Haas	f9143d102f	Rework custom scans to work more like the new extensible node stuff. Per discussion, the new extensible node framework is thought to be better designed than the custom path/scan/scanstate stuff we added in PostgreSQL 9.5. Rework the latter to be more like the former. This is not backward-compatible, but we generally don't promise that for C APIs, and there probably aren't many people using this yet anyway. KaiGai Kohei, reviewed by Petr Jelinek and me. Some further cosmetic changes by me.	2016-03-29 11:28:04 -04:00
Tom Lane	534da37927	Protect zic's symlink() call with #ifdef HAVE_SYMLINK. The IANA crew seem to think that symlink() exists everywhere nowadays, and they may well be right. But we use #ifdef HAVE_SYMLINK elsewhere so for consistency we should do it here too. Noted by Michael Paquier.	2016-03-29 11:06:44 -04:00
Tom Lane	6d257e732b	Fix zic for Windows. The new coding of dolink() is dependent on link() returning an on-point errno when it fails; but the quick-hack implementation of link() that we'd put in for Windows didn't bother with setting errno. Fix that. Analysis and patch by Christian Ullrich.	2016-03-29 10:40:08 -04:00
Tom Lane	656ee84890	Fix portability issues in `86c43f4e22`. INT64_MIN/MAX should be spelled PG_INT64_MIN/MAX, per well established convention in our sources. Less obviously, a symbol named DOUBLE causes problems on Windows builds, so rename that to DOUBLE_CONST; and rename INTEGER to INTEGER_CONST for consistency. Also, get rid of incorrect/obsolete hand-munging of yycolumn, and fix the grammar for float constants to handle expected cases such as ".1". First two items by Michael Paquier, second two by me.	2016-03-29 00:53:53 -04:00
Robert Haas	5d4171d1c7	Don't require a user mapping for FDWs to work. Commit `fbe5a3fb73` accidentally changed this behavior; put things back the way they were, and add some regression tests. Report by Andres Freund; patch by Ashutosh Bapat, with a bit of kibitzing by me.	2016-03-28 21:50:28 -04:00
Robert Haas	868628e4fd	On all Windows platforms, not just Cygwin, use _timezone and _tzname. Up until now, we've been using timezone and tzname, but Visual Studio 2015 (for which we wish to add support) no longer declares those symbols. All versions since Visual Studio 2003 apparently support the underscore-equipped names, and we don't support anything older than Visual Studio 2005, so this should work OK everywhere. But let's see what the buildfarm thinks. Michael Paquier, reviewed by Petr Jelinek	2016-03-28 20:59:25 -04:00
Robert Haas	bd0f206f55	Fix typo in comment. Thomas Munro	2016-03-28 20:55:15 -04:00
Robert Haas	86c43f4e22	pgbench: Support double constants and functions. The new functions are pi(), random(), random_exponential(), random_gaussian(), and sqrt(). I was worried that this would be slower than before, but, if anything, it actually turns out to be slightly faster, because we now express the built-in pgbench scripts using fewer lines; each \setrandom can be merged into a subsequent \set. Fabien Coelho	2016-03-28 20:45:57 -04:00
Alvaro Herrera	9bd61311bd	PostgresNode: initialize $timed_out if passed Corrects an oversight in `2c83f435a3` where the $timed_out reference var isn't initialized; using it would require the caller to initialize it beforehand, which is cumbersome. Author: Craig Ringer	2016-03-28 19:17:06 -03:00
Tom Lane	1f4e9da624	Sync tzload() and tzparse() APIs with IANA release tzcode2016c. This brings us a bit closer to matching upstream, but since it affects files outside src/timezone/, we might choose not to back-patch it. Hence keep it separate from the main update patch.	2016-03-28 17:19:29 -04:00
Tom Lane	f5f15ea6aa	Fix MSVC build for changes in zic. zic now only needs zic.c, but I didn't realize knowledge about it was hardwired into Mkvcbuild.pm. Per buildfarm.	2016-03-28 16:02:07 -04:00
Tom Lane	1c1a7cbd6a	Sync our copy of the timezone library with IANA release tzcode2016c. We hadn't done this in about six years, which proves to have been a mistake because there's been a lot of code churn upstream, making the merge rather painful. But putting it off any further isn't going to lessen the pain, and there are at least two incompatible changes that we need to absorb before someone starts complaining that --with-system-tzdata doesn't work at all on their platform, or we get blindsided by a tzdata release that our out-of-date zic can't compile. Last week's "time zone abbreviation differs from POSIX standard" mess was a wake-up call in that regard. This is a sufficiently large patch that I'm afraid to back-patch it immediately, though the foregoing considerations imply that we probably should do so eventually. For the moment, just put it in HEAD so that it can get some testing. Maybe we can wait till the end of the 9.6 beta cycle before deeming it okay.	2016-03-28 15:10:17 -04:00
Tom Lane	e5a4dea80f	Document errhidecontext() where it ought to be documented. Seems to have been missed when this function was added. Noted while looking at David Steele's proposal to add another similar function.	2016-03-28 14:18:14 -04:00
Alvaro Herrera	4b746f0d07	Update expected file from quoting change I neglected to update this in `59a2111b23`. Per buildfarm	2016-03-28 14:40:32 -03:00
Alvaro Herrera	cad3edef4f	pg_rewind: Improve internationalization This is mostly cosmetic since two of the three changes are debug messages, and the third one is just a progress indicator. Author: Michaël Paquier	2016-03-28 14:33:00 -03:00
Alvaro Herrera	37732a2555	Fix minor leak in pg_dump for ACCESS METHOD. Bug reported by Coverity. Author: Michaël Paquier	2016-03-28 14:27:41 -03:00
Alvaro Herrera	59a2111b23	Improve internationalization of messages involving type names Change the slightly different variations of the message function FOO must return type BAR to a single wording, removing the variability in type name so that they all create a single translation entry; since the type name is not to be translated, there's no point in it being part of the message anyway. Also, change them all to use the same quoting convention, namely that the function name is not to be quoted but the type name is. (I'm not quite sure why this is so, but it's the clear majority.) Some similar messages such as "encoding conversion function FOO must ..." are also changed.	2016-03-28 14:24:37 -03:00
Teodor Sigaev	559e7a0a6d	psql tab-complete for CREATE/DROP ACCESS METHOD Alexander Korotkov	2016-03-28 19:32:13 +03:00
Teodor Sigaev	dabd255d58	Fix comment in pg_dump. It was missed in `473b932870`, CREATE ACCESS METHOD Alexander Korotkov	2016-03-28 19:17:28 +03:00
Stephen Frost	86ebf30fd6	Reset plan->row_security_env and planUserId In the plancache, we check if the environment we planned the query under has changed in a way which requires us to re-plan, such as when the user for whom the plan was prepared changes and RLS is being used (and, therefore, there may be different policies to apply). Unfortunately, while those values were set and checked, they were not being reset when the query was re-planned and therefore, in cases where we change role, re-plan, and then change role again, we weren't re-planning again. This leads to potentially incorrect policies being applied in cases where role-specific policies are used and a given query is planned under one role and then executed under other roles, which could happen under security definer functions or when a common user and query is planned initially and then re-used across multiple SET ROLEs. Further, extensions which made use of CopyCachedPlan() may suffer from similar issues as the RLS-related fields were not properly copied as part of the plan and therefore RevalidateCachedQuery() would copy in the current settings without invalidating the query. Fix by using the same approach used for 'search_path', where we set the correct values in CompleteCachedPlan(), check them early on in RevalidateCachedQuery() and then properly reset them if re-planning. Also, copy through the values during CopyCachedPlan(). Pointed out by Ashutosh Bapat. Reviewed by Michael Paquier. Back-patch to 9.5 where RLS was introduced. Security: CVE-2016-2193	2016-03-28 09:03:20 -04:00
Tom Lane	d12e5bb79b	Code and docs review for commit `3187d6de0e`. Fix up check for high-bit-set characters, which provoked "comparison is always true due to limited range of data type" warnings on some compilers, and was unlike the way we do it elsewhere anyway. Fix omission of "$" from the set of valid identifier continuation characters. Get rid of sanitize_text(), which was utterly inconsistent with any other error report anywhere in the system, and wasn't even well designed on its own terms (double-quoting the result string without escaping contained double quotes doesn't seem very well thought out). Fix up error messages, which didn't follow the message style guidelines very well, and were overly specific in situations where the actual mistake might not be what they said. Improve documentation. (I started out just intending to fix the compiler warning, but the more I looked at the patch the less I liked it.)	2016-03-28 01:00:30 -04:00
Tom Lane	d65b665d52	Guard against zero vardata.rel->tuples in estimate_hash_bucketsize(). If the referenced rel was proven empty, we'd compute 0/0 here, which results in the function returning NaN. That's a bit more serious than the other zero-divide case. Still, it only seems to be possible in HEAD, so no back-patch. Per report from Piotr Stefaniak. I looked through the rest of selfuncs.c and found no other likely trouble spots.	2016-03-27 18:21:03 -04:00
Tom Lane	fa09f89351	Clamp adjusted ndistinct to positive integer in estimate_hash_bucketsize(). This avoids a possible divide-by-zero in the following calculation, and rounding the number to an integer seems like saner behavior anyway. Assuming IEEE math, the division would yield +Infinity which would get replaced by 1.0 at the bottom of the function, so nothing really interesting would ensue; but avoiding divide-by-zero seems like a good idea on general principles. Per report from Piotr Stefaniak. No back-patch since this seems mostly cosmetic.	2016-03-27 18:07:16 -04:00
Andres Freund	408f043853	pg_rewind: fsync target data directory. Previously pg_rewind did not fsync any files. That's problematic, given that the target directory is modified. If the database was started afterwards, `2ce439f33` luckily already caused the data directory to be synced to disk at postmaster startup; reducing the scope of the problem. To fix, use initdb -S, at the end of the pg_rewind run. It doesn't seem worthwhile to duplicate the code into pg_rewind, and initdb -S is already used that way by pg_upgrade. Reported-By: Andres Freund Author: Michael Paquier, somewhat edited by me Discussion: 20160310034352.iuqgvpmg5qmnxtkz@alap3.anarazel.de CAB7nPqSytVG1o4S3S2pA1O=692ekurJ+fckW2PywEG3sNw54Ow@mail.gmail.com Backpatch: 9.5, where pg_rewind was introduced	2016-03-27 23:46:25 +02:00
Andres Freund	9f7c527af3	Fix LWLockReportWaitEnd() parameter list to be (void). Previously it was an "old style" function declaration.	2016-03-27 22:53:31 +02:00
Andres Freund	a6c845946d	pg_rewind: Close backup_label file descriptor. This was a relatively harmless leak, as createBackupLabel() is only called once per pg_rewind invocation. Author: Michael Paquier Reported-By: Michael Paquier Discussion: CAB7nPqRnOw30gOXe2_SPLjh37bgm4V+txbYAPwoXb97nGQ297w@mail.gmail.com Backpatch: 9.5, where pg_rewind was introduced	2016-03-27 22:48:31 +02:00
Andres Freund	1a7a43672b	Don't use !! but != 0/NULL to force boolean evaluation. I introduced several uses of !! to force bit arithmetic to be boolean, but per discussion the project prefers != 0/NULL. Discussion: CA+TgmoZP5KakLGP6B4vUjgMBUW0woq_dJYi0paOz-My0Hwt_vQ@mail.gmail.com	2016-03-27 18:10:19 +02:00
Andres Freund	af4472bcb8	Change various GinIs macros to return 0/1. Returning the direct result of bit arithmetic, in a macro intended to be used in a boolean manner, can be problematic if the return value is stored in a variable of type 'bool'. If bool is implemented using C99's _Bool, that can lead to comparison failures if the variable is then compared again with the expression (see ginStepRight() for an example that fails), as _Bool forces the result to be 0/1. That happens in some configurations of newer MSVC compilers. It's also problematic when storing the result of such an expression in a narrower type. Several gin macros have been declared in that style since gin's initial commit in `8a3631f8d8`. There's a lot more macros like this, but this is the only one causing regression test failures; and I don't want to commit and backpatch a larger patch with lots of conflicts just before the next set of minor releases. Discussion: 20150811154237.GD17575@awork2.anarazel.de Backpatch: All supported branches	2016-03-27 17:46:48 +02:00
Tom Lane	221619ad69	Modernize zic's test for valid timezone abbreviations. We really need to sync all of our IANA-derived timezone code with upstream, but that's going to be a large patch and I certainly don't care to shove such a thing into stable branches immediately before a release. As a stopgap, copy just the tzcode2016c logic that checks validity of timezone abbreviations. This prevents getting multiple "time zone abbreviation differs from POSIX standard" bleats with tzdata 2014b and later.	2016-03-26 15:58:44 -04:00
Tom Lane	76281aa964	Avoid a couple of zero-divide scenarios in the planner. cost_subplan() supposed that the given subplan must have plan_rows > 0, which as far as I can tell was true until recent refactoring of the code in createplan.c; but now that code allows the Result for a provably empty subquery to have plan_rows = 0. Rather than undo that change, put in a clamp to prevent zero divide. get_cheapest_fractional_path() likewise supposed that best_path->rows > 0. This assumption has been wrong for longer. It's actually harmless given IEEE float math, because a positive value divided by zero gives +Infinity and compare_fractional_path_costs() will do the right thing with that. Still, best not to assume that. final_cost_nestloop() also seems to have some risks in this area, so borrow the clamping logic already present in the mergejoin cost functions. Lastly, remove unnecessary clamp_row_est() in planner.c's calls to get_number_of_groups(). The only thing that function does with path_rows is pass it to estimate_num_groups() which already has an internal clamp, so we don't need the extra call; and if we did, the callers are arguably the wrong place for it anyway. First two items reported by Piotr Stefaniak, the others are products of my nosing around for similar problems. No back-patch since there's no evidence that problems arise in the back branches.	2016-03-26 12:03:12 -04:00
Tom Lane	676265eb7b	Update time zone data files to tzdata release 2016c. DST law changes in Azerbaijan, Chile, Haiti, Palestine, and Russia (Altai, Astrakhan, Kirov, Sakhalin, Ulyanovsk regions). Historical corrections for Lithuania, Moldova, Russia (Kaliningrad, Samara, Volgograd). As of 2015b, the keepers of the IANA timezone database started to use numeric time zone abbreviations (e.g., "+04") instead of inventing abbreviations not found in the wild like "ASTT". This causes our rather old copy of zic to whine "warning: time zone abbreviation differs from POSIX standard" several times during "make install". This warning is harmless according to the IANA folk, and I don't see any problems with these abbreviations in some simple tests; but it seems like now would be a good time to update our copy of the tzcode stuff. I'll look into that soon.	2016-03-25 19:03:08 -04:00
Tom Lane	9f73a2f6d1	Fix PL/Tcl for vpath builds. Commit `cd37bb7859` works for in-tree builds, but not so much for VPATH. Per buildfarm.	2016-03-25 17:13:03 -04:00
Tom Lane	cd37bb7859	Improve PL/Tcl errorCode facility by providing decoded name for SQLSTATE. We don't really want to encourage people to write numeric SQLSTATEs in programs; that's unreadable and error-prone. Copy plpgsql's infrastructure for converting between SQLSTATEs and exception names shown in Appendix A, and modify examples in tests and documentation to do it that way.	2016-03-25 16:54:52 -04:00
Tom Lane	fb8d2a7f57	In PL/Tcl, make database errors return additional info in the errorCode. Tcl has a convention for returning additional info about an error in a global variable named errorCode. Up to now PL/Tcl has ignored that, but this patch causes database errors caught by PL/Tcl to fill in errorCode with useful information from the ErrorData struct. Jim Nasby, reviewed by Pavel Stehule and myself	2016-03-25 15:52:53 -04:00
Tom Lane	c94959d411	Fix DROP OPERATOR to reset oprcom/oprnegate links to the dropped operator. This avoids leaving dangling links in pg_operator; which while fairly harmless are also unsightly. While we're at it, simplify OperatorUpd, which went through heap_modify_tuple for no very good reason considering it had already made a tuple copy it could just scribble on. Roma Sokolov, reviewed by Tomas Vondra, additional hacking by Robert Haas and myself.	2016-03-25 12:33:16 -04:00
Tom Lane	d543170f2f	Don't split up SRFs when choosing to postpone SELECT output expressions. In commit `9118d03a8c` we taught the planner to postpone evaluation of set-returning functions in a SELECT's targetlist until after any sort done to satisfy ORDER BY. However, if we postpone some SRFs this way while others do not get postponed (because they're sort or group key columns) we will break the traditional behavior by which all SRFs in the tlist run in-step during ExecTargetList(), so that you get the least common multiple of their periods not the product. Fix make_sort_input_target() so it will not split up SRF evaluation in such cases. There is still a hazard of similar odd behavior if there's a SRF in a grouping column and another one that isn't, but that was true before and we're just trying to preserve bug-compatibility with the traditional behavior. This whole area is overdue to be rethought and reimplemented, but we'll try to avoid changing behavior until then. Per report from Regina Obe.	2016-03-25 11:19:51 -04:00
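The "least common multiple of their periods, not the product" behavior described above can be modeled with a short sketch. This is only a toy model of set-returning functions running in step, written in Python rather than taken from the executor; `lockstep_rows` is a hypothetical helper name.

```python
from itertools import cycle, islice
from math import gcd


def lockstep_rows(*series):
    """Model SRFs run in step: each series restarts when exhausted, and output
    stops once all series finish at the same time (LCM of their lengths)."""
    total = 1
    for s in series:
        total = total * len(s) // gcd(total, len(s))
    return list(islice(zip(*(cycle(s) for s in series)), total))


if __name__ == "__main__":
    # Periods 2 and 4 give 4 rows, not the 8 a cross product would give.
    for row in lockstep_rows([1, 2], [10, 20, 30, 40]):
        print(row)
```

If one of the two series were instead evaluated in a separate step, the result would be the cross product, which is the behavior change the commit avoids.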
Tom Lane	7caaeaf360	Link libpq after libpgfeutils to satisfy Windows linker. Some of the non-MSVC Windows buildfarm members seem to need this to avoid getting "undefined symbol" errors on libpgfeutils' references to libpq. I could understand that if libpq were a static library, but surely it is not? Oh well, at least the extra reference is no more harmful than it is for libpgcommon or libpgport.	2016-03-24 20:45:31 -04:00
Tom Lane	c1156411ad	Move psql's psqlscan.l into src/fe_utils. This completes (at least for now) the project of getting rid of ad-hoc linkages among the src/bin/ subdirectories. Everything they share is now in src/fe_utils/ and is included from a static library at link time. A side benefit is that we can restore the FLEX_NO_BACKUP check for psqlscanslash.l. We might need to think of another way to do that check if we ever need to build two lexers with that property in the same source directory, but there's no foreseeable reason to need that.	2016-03-24 20:28:47 -04:00
Tom Lane	d65bea26a8	Move psql's print.c and mbprint.c into src/fe_utils. Just turning the crank ...	2016-03-24 18:27:28 -04:00
Tom Lane	a376960c8f	Suppress compiler warning for get_am_type_string(). Compilers that don't know that elog(ERROR) doesn't return complained that this function might fail to return a value. Per buildfarm. While at it, const-ify the function's declaration, since the intent is evidently to always return a constant string.	2016-03-24 17:22:24 -04:00
Tom Lane	0ecd3fedfc	Add missed inclusion requirement in Mkvcbuild.pm. Per buildfarm.	2016-03-24 17:12:40 -04:00
Tom Lane	588d963b00	Create src/fe_utils/, and move stuff into there from pg_dump's dumputils. Per discussion, we want to create a static library and put the stuff into it that until now has been shared across src/bin/ directories by ad-hoc methods like symlinking a source file. This commit creates the library and populates it with a couple of files that contain the widely-useful portions of pg_dump's dumputils.c file. dumputils.c survives, because it has some stuff that didn't seem appropriate for fe_utils, but it's significantly smaller and is no longer referenced from any other directory. Follow-on patches will move more stuff into fe_utils. The Mkvcbuild.pm hacking here is just a best guess; we'll see how the buildfarm likes it.	2016-03-24 15:55:57 -04:00
Robert Haas	59a02815e2	Use correct GetDatum function. Oops.	2016-03-24 08:57:48 -04:00
Tom Lane	c2d1eea9e7	Avoid PGDLLIMPORT for simple local references in frontend programs. I was wondering if this would be an issue, and buildfarm member frogmouth says it is.	2016-03-23 23:26:44 -04:00
Alvaro Herrera	473b932870	Support CREATE ACCESS METHOD. This enables external code to create access methods. This is useful so that extensions can add their own access methods which can be formally tracked for dependencies, so that DROP operates correctly. Also, having explicit support makes pg_dump work correctly. Currently only index AMs are supported, but we expect different types to be added in the future. Authors: Alexander Korotkov, Petr Jelínek Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby Commitfest-URL: https://commitfest.postgresql.org/9/353/ Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com	2016-03-23 23:01:35 -03:00
Tom Lane	2c6af4f442	Move keywords.c/kwlookup.c into src/common/. Now that we have src/common/ for code shared between frontend and backend, we can get rid of (most of) the klugy ways that the keyword table and keyword lookup code were formerly shared between different uses. This is a first step towards a more general plan of getting rid of special-purpose kluges for sharing code in src/bin/. I chose to merge kwlookup.c back into keywords.c, as it once was, and always has been so far as keywords.h is concerned. We could have kept them separate, but there is noplace that uses ScanKeywordLookup without also wanting access to the backend's keyword list, so there seems little point. ecpg is still a bit weird, but at least now the trickiness is documented. I think that the MSVC build script should require no adjustments beyond what's done here ... but we'll soon find out.	2016-03-23 20:22:08 -04:00
Robert Haas	3df9c374e2	Disable abbreviated keys for string-sorting in non-C locales. Unfortunately, every version of glibc thus far tested has bugs whereby strcoll() ordering does not match strxfrm() ordering as required by the standard. This can result in, for example, corrupted indexes. Disabling abbreviated keys in these cases slows down non-C-collation string sorting considerably, but there seems to be no practical alternative. Users who are confident that their libc implementations are solid in this regard can re-enable the optimization by compiling with TRUST_STRXFRM. Users who have built indexes using PostgreSQL 9.5 or PostgreSQL 9.5.1 should REINDEX if there is a possibility that they may have been affected by this problem. Report by Marc-Olaf Jaschke. Investigation mostly by Tom Lane, with help from Peter Geoghegan, Noah Misch, Stephen Frost, and me. Patch by me, reviewed by Peter Geoghegan and Tom Lane.	2016-03-23 16:03:13 -04:00
Robert Haas	44ca4022f3	Partition the freelist for shared dynahash tables. Without this, contention on the freelist can become a pretty serious problem on large servers. Aleksander Alekseev, reviewed by Anastasia Lubennikova, Dilip Kumar, and me.	2016-03-23 11:00:54 -04:00
Tom Lane	ea4b8bd618	Code review for error reports in jsonb_set(). User-facing (even tested by regression tests) error conditions were thrown with elog(), hence had wrong SQLSTATE and were untranslatable. And the error message texts weren't up to project style, either.	2016-03-23 11:00:39 -04:00
Tom Lane	384dfbde19	Fix unsafe use of strtol() on a non-null-terminated Text datum. jsonb_set() could produce wrong answers or incorrect error reports, or in the worst case even crash, when trying to convert a path-array element into an integer for use as an array subscript. Per report from Vitaly Burovoy. Back-patch to 9.5 where the faulty code was introduced (in commit `c6947010ce`). Michael Paquier	2016-03-23 10:43:13 -04:00
Simon Riggs	8320c625d9	Change comment to describe correct lock level used	2016-03-23 11:32:34 +00:00
Tom Lane	71404af2a2	Fix EvalPlanQual bug when query contains both locked and not-locked rels. In commit `afb9249d06`, we (probably I) made ExecLockRows assign null test tuples to all relations of the query while setting up to do an EvalPlanQual recheck for a newly-updated locked row. This was sheerest brain fade: we should only set test tuples for relations that are lockable by the LockRows node, and in particular empty test tuples are only sensible for inheritance child relations that weren't the source of the current tuple from their inheritance tree. Setting a null test tuple for an unrelated table causes it to return NULLs when it should not, as exhibited in bug #14034 from Bronislav Houdek. To add insult to injury, doing it the wrong way required two loops where one would suffice; so the corrected code is even a bit shorter and faster. Add a regression test case based on his example, and back-patch to 9.5 where the bug was introduced.	2016-03-22 17:56:20 -04:00
Tom Lane	b283096534	Allow the delay in psql's \watch command to be a fractional second. Instead of just "2" seconds, allow eg. "2.5" seconds. Per request from Alvaro Herrera. No docs change since the docs didn't say you couldn't do this already.	2016-03-21 18:34:18 -04:00
Tom Lane	dea2b5960a	Improve header output from psql's \watch command. Include the \pset title string if there is one, and shorten the prefab part of the header to be "timestamp (every Ns)". Per suggestion by David Johnston. Michael Paquier and Tom Lane	2016-03-21 18:18:13 -04:00
Robert Haas	ae507d9222	Make max_parallel_degree PGC_USERSET. It was intended to be this way all along, just like other planner GUCs such as work_mem. But I goofed.	2016-03-21 10:54:36 -04:00
Robert Haas	e06a38965b	Support parallel aggregation. Parallel workers can now partially aggregate the data and pass the transition values back to the leader, which can combine the partial results to produce the final answer. David Rowley, based on earlier work by Haribabu Kommi. Reviewed by Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.	2016-03-21 09:30:18 -04:00
Andres Freund	7fa0064092	Properly declare FeBeWaitSet. Surprising that this worked on a number of systems. Reported by buildfarm member longfin.	2016-03-21 12:58:18 +01:00
Andres Freund	98a64d0bd7	Introduce WaitEventSet API. Commit `ac1d794` ("Make idle backends exit if the postmaster dies.") introduced a regression on, at least, large Linux systems. Constantly adding the same postmaster_alive_fds to the OS's internal data structures for implementing poll/select can cause significant contention, leading to a performance regression of nearly 3x in one example. This can be avoided by using e.g. Linux's epoll, which avoids having to add/remove file descriptors to the wait data structures at a high rate. Unfortunately the current latch interface makes it hard to allocate any persistent per-backend resources. Replace, with a backward compatibility layer, WaitLatchOrSocket with a new WaitEventSet API. Users can allocate such a Set across multiple calls, and add more than one file-descriptor to wait on. The latter has been added because there are upcoming postgres features where that will be helpful. In addition to the previously existing poll(2), select(2), WaitForMultipleObjects() implementations also provide an epoll_wait(2) based implementation to address the aforementioned performance problem. Epoll is only available on Linux, but that is the most likely OS for machines large enough (four sockets) to reproduce the problem. To actually address the aforementioned regression, create and use a long-lived WaitEventSet for FE/BE communication. There are additional places that would benefit from a long-lived set, but that's a task for another day. Thanks to Amit Kapila, who helped make the Windows code I blindly wrote actually work. Reported-By: Dmitry Vasilyev Discussion: CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com 20160114143931.GG10941@awork2.anarazel.de	2016-03-21 12:22:54 +01:00
Andres Freund	72e2d21c12	Combine win32 and unix latch implementations. Previously latches for windows and unix had been implemented in different files. A later patch introduces an expanded wait infrastructure; keeping the implementations separate would introduce too much duplication. This basically just moves the functions, without too much change. The reason to keep this separate is that it allows blame to continue working a little less badly; and to make review a tiny bit easier. Discussion: 20160114143931.GG10941@awork2.anarazel.de	2016-03-21 11:03:26 +01:00
Andres Freund	326d73c86f	Second attempt at fixing MSVC build for `68ab8e8ba4`. After the previous fix in `6f1f34c9` msvc ended up looking for psqlscan.c in the wrong directory. David's fix just forces the path to be adjusted. That's not a particularly pretty fix, but it hopefully will make the buildfarm green again. Author: David Rowley Discussion: CAKJS1f_9CCi_t+LEgV5GWoCj3wjavcMoDc5qfcf_A0UwpQoPoA@mail.gmail.com	2016-03-21 10:49:45 +01:00
Tom Lane	b6afae71aa	Use %option bison-bridge in psql/pgbench lexers. The point of this change is to use %pure-parser in pgbench's exprparse.y. The immediate reason is that it turns out very ancient versions of bison have a bug with the combination of a reentrant lexer and non-reentrant parser. We could consider dropping support for such ancient bisons; but considering that we might well need exprparse.y to be reentrant some day, it seems better to make it so right now than to move the portability goalposts. (AFAICT there's no particular performance consequence to this change, either, so there's no good reason not to do it.) Now, %pure-parser assumes that the called lexer is built with %option bison-bridge. Because we're assuming bitwise compatibility of yyscan_t (yyguts_t) data structures among all the psql/pgbench lexers, that requirement propagates back to psql's lexers as well. But it's just a few lines of change on that side too; and if psqlscan.l is to set the baseline for a possibly-large family of lexers, it should err on the side of including not omitting useful features.	2016-03-20 21:59:03 -04:00
Tom Lane	6f1f34c92b	Best-guess attempt at fixing MSVC build for `68ab8e8ba4`. pgbench now needs to use src/bin/psql/psqlscan.l, but it's not very clear how to fit that into the MSVC build system. If this doesn't work I'm going to need some help from somebody who actually understands those scripts ...	2016-03-20 17:51:54 -04:00
Tom Lane	68ab8e8ba4	SQL commands in pgbench scripts are now ended by semicolons, not newlines. To allow multiline SQL commands in scripts, adopt the same rules psql uses to decide what is the end of a SQL command, to wit, an unquoted semicolon not encased in parentheses. Do this by importing the same flex lexer that psql uses, since coping with stuff like dollar-quoted literals is hard to get right without going the full nine yards. This makes use of the infrastructure added in commit `0ea9efbe9e` to support independently-written flex lexers scanning the same PsqlScanState input-buffer data structure. Since that infrastructure isn't very friendly to ad-hoc parsing code such as strtok(), improve exprscan.l so that it can parse either whitespace-separated words or expression tokens, on demand, and rewrite pgbench.c's backslash-command parsing code to always use the lexer to fetch tokens. It's still the case that pgbench backslash commands extend to the end of the line, no more and no less. That could be changed in a fairly localized way now, and there was some interest in doing so, but it seems like material for a separate patch. In passing, make some marginal cleanups in syntax error reporting, const-ify a few data structures that could use it, and run some of this code through pgindent. I can't tell whether the MSVC build scripts need to be taught explicitly about the changes here or not, but the buildfarm will soon tell us. Kyotaro Horiguchi and Tom Lane	2016-03-20 12:58:51 -04:00
Andrew Dunstan	5d03201056	Remove dependency on psed for MSVC builds. Modern Perl has removed psed from its core distribution, so it might not be readily available on some build platforms. We therefore replace its use with a Perl script generated by s2p, which is equivalent to the sed script. The latter is retained for non-MSVC builds to avoid creating a new hard dependency on Perl for non-Windows tarball builds. Backpatch to all live branches. Michael Paquier and me.	2016-03-19 18:36:35 -04:00
Tom Lane	d5351fcb03	Fix phony .PHONY. A couple makefiles had misspelled the magic .PHONY target as PHONY.	2016-03-19 17:19:37 -04:00
Tom Lane	429ee5a822	Make pgbench's expression lexer reentrant. This is a necessary preliminary step for making it play with psqlscan.l given the way I set up the lexer input-buffer sharing mechanism in commit `0ea9efbe9e`. I've not tried to make it actually reentrant; there's still some static variables laying about. But flex thinks it's reentrant, and that's what counts. In support of that, fix exprparse.y to pass through the yyscan_t from the caller. Also do some minor code beautification, like not casting away const.	2016-03-19 16:35:41 -04:00
Alvaro Herrera	1038bc91ca	pgbench: Silence new compiler warnings. The original coding in `7bafffea64` and previous wasn't all that great anyway. Reported by Jeff Janes and Tom Lane	2016-03-19 16:16:39 -03:00
Tom Lane	78e7c44399	Typo fix.	2016-03-19 14:36:52 -04:00
Tom Lane	21c8ee7946	Sync backend/parser/scan.l with bin/psql/psqlscan.l. Make some minor formatting adjustments to make it easier to diff these files and see that they indeed implement the same flex rules (at least to the extent that we want them to be the same). (Someday it'd be nice to make ecpg's pgc.l more easily diff'able too, but today is not that day.) Also run relevant parts of these files and psqlscanslash.l through pgindent. No actual behavioral changes here, just obsessive neatnik-ism.	2016-03-19 14:36:22 -04:00
Tom Lane	72b1e3a21f	Build backend/parser/scan.l and interfaces/ecpg/preproc/pgc.l standalone. Now that we know about the %top{} trick, we can revert to building flex lexers as separate .o files. This is worth doing for a couple of reasons besides sheer cleanliness. We can narrow the scope of the -Wno-error flag that's forced on scan.c. Also, since these grammar and lexer files are so large, splitting them into separate build targets should have some advantages in build speed, particularly in parallel or ccache'd builds. We have quite a few other .l files that could be changed likewise, but the above arguments don't apply to them, so the benefit of fixing them seems pretty minimal. Leave the rest for some other day.	2016-03-19 12:07:24 -04:00
Alvaro Herrera	7bafffea64	pgbench: Allow changing weights for scripts. Previously, all scripts had the same probability of being chosen when multiple of them were specified via -b, -f, -N, -S. With this commit, -b and -f now search for an "@" in the script name and use the integer found after it as the drawing probability for that script. (One disadvantage is that if you have scripts whose names contain @, you are now forced to specify "@1" at the end; otherwise the name's @ is confused with a weight separator. We don't expect many pgbench scripts with @ in their names in the wild, so this shouldn't be too serious a problem.) While at it, rework the interface between addScript, process_file, process_builtin, and findBuiltin. It had gotten a bit out of hand with recent commits. Author: Fabien Coelho Reviewed-By: Andres Freund, Robert Haas, Álvaro Herrera, Michaël Paquier Discussion: http://www.postgresql.org/message-id/alpine.DEB.2.10.1603160721240.1666@sto	2016-03-19 12:32:42 -03:00
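A rough sketch of the "name@weight" convention described above, in Python rather than pgbench's C. The helper names are hypothetical, and splitting on the last "@" is an assumption inferred from the note that a name containing "@" needs a trailing "@1".

```python
def parse_script_weight(option: str, default_weight: int = 1):
    """Split 'name@weight' on the last '@'; no '@' means the default weight."""
    name, sep, weight = option.rpartition("@")
    if not sep:
        return option, default_weight
    return name, int(weight)


def draw_probabilities(scripts):
    """Turn (name, weight) pairs into each script's probability of being drawn."""
    total = sum(weight for _, weight in scripts)
    return {name: weight / total for name, weight in scripts}


if __name__ == "__main__":
    scripts = [parse_script_weight(o) for o in ("select-only@3", "tpcb-like")]
    print(draw_probabilities(scripts))
```

Under this reading, `my@script.sql@1` parses as the name `my@script.sql` with weight 1, which is why the explicit suffix is needed for such names.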
Tom Lane	b46d9beb65	With ancient gcc, skip pg_attribute_printf() on function pointer. Buildfarm results show that the ability to attach pg_attribute_printf decoration to a function pointer appeared somewhere between gcc 2.95.3 and gcc 4.0.1. Guess that it was there in 4.0.	2016-03-19 10:59:20 -04:00
Peter Eisentraut	9a83564c58	Allow SSL server key file to have group read access if owned by root. We used to require the server key file to have permissions 0600 or less for best security. But some systems (such as Debian) have certificate and key files managed by the operating system that can be shared with other services. In those cases, the "postgres" user is made a member of a special group that has access to those files, and the server key file has permissions 0640. To accommodate that kind of setup, also allow the key file to have permissions 0640 but only if owned by root. From: Christoph Berg <myon@debian.org> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2016-03-19 11:03:22 +01:00
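The permission rule described above might be expressed roughly like this. It is a Python sketch of the stated policy, not the server's actual check; the function name is hypothetical, and rejecting files owned by any other user is an assumption added for the example.

```python
import stat


def key_file_permissions_ok(mode: int, file_uid: int, server_uid: int) -> bool:
    """Apply the 0600-or-less / root-owned-0640-or-less rule to a key file."""
    if file_uid == server_uid:
        # Owned by the server user: no group or other access at all.
        return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    if file_uid == 0:
        # Owned by root: group read is tolerated, nothing else beyond owner bits.
        return (mode & (stat.S_IWGRP | stat.S_IXGRP | stat.S_IRWXO)) == 0
    # Assumption for this sketch: any other owner is rejected.
    return False


if __name__ == "__main__":
    print(key_file_permissions_ok(0o640, file_uid=0, server_uid=1000))
    print(key_file_permissions_ok(0o640, file_uid=1000, server_uid=1000))
```

Taking the mode and uids as plain arguments keeps the rule testable without touching the filesystem; a caller could feed it the fields of an `os.stat()` result.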
Andres Freund	6eb2be15b5	Fix stupid omission in `c4901a1e`. Reported-By: Jeff Janes Discussion: CAMkU=1zGxREwoyaCrp_CHadEB+dPgpVyKBysCJ+6xP9gCOvAuw@mail.gmail.com	2016-03-18 22:37:59 -07:00
Tom Lane	07aed46a6b	Fix missed update in _readForeignScan(). Blatant fail in `0bf3ae88af`. Caught by buildfarm member mandrill.	2016-03-19 01:20:34 -04:00
Tom Lane	ff0a7e6167	Use yylex_init not yylex_init_extra(). Older versions of flex don't have the latter. Per buildfarm.	2016-03-19 01:02:18 -04:00
Tom Lane	a3e39f8363	Suppress FLEX_NO_BACKUP check for psqlscanslash.l. The existing infrastructure for FLEX_NO_BACKUP doesn't work reliably when two lexers are built in parallel in the same directory. We can probably fix that, but as a short-term workaround, just don't make the check for psqlscanslash.l. Per buildfarm.	2016-03-19 00:43:46 -04:00
Tom Lane	0ea9efbe9e	Split psql's lexer into two separate .l files for SQL and backslash cases. This gets us to a point where psqlscan.l can be used by other frontend programs for the same purpose psql uses it for, ie to detect when it's collected a complete SQL command from input that is divided across line boundaries. Moreover, other programs can supply their own lexers for backslash commands of their own choosing. A follow-on patch will use this in pgbench. The end result here is roughly the same as in Kyotaro Horiguchi's 0001-Make-SQL-parser-part-of-psqlscan-independent-from-ps.patch, although the details of the method for switching between lexers are quite different. Basically, in this patch we share the entire PsqlScanState, YY_BUFFER_STATE stack, and yyscan_t between different lexers. The only thing we need to do to switch to a different lexer is to make sure the start_state is valid for the new lexer. This works because flex doesn't keep any other persistent state that depends on the specific lexing tables generated for a particular .l file. (We are assuming that both lexers are built with the same flex version, or at least versions that are compatible with respect to the contents of yyscan_t; but that doesn't seem likely to be a big problem in practice, considering how slowly flex changes.) Aside from being more efficient than Horiguchi-san's original solution, this avoids possible corner-case changes in semantics: the original code was capable of popping the input buffer stack while still staying in backslash-related parsing states. I'm not sure that that equates to any useful user-visible behaviors, but I'm not sure it doesn't either, so I'm loath to assume that we only need to consider the topmost buffer when parsing a backslash command. I've attempted to update the MSVC build scripts for the added .l file, but will rely on the buildfarm to see if I missed anything. Kyotaro Horiguchi and Tom Lane	2016-03-19 00:24:55 -04:00
Tom Lane	27199058d9	Convert psql's flex lexer to be re-entrant, and make it compile standalone. Change psqlscan.l to specify '%option reentrant', adjust internal APIs to match, and get rid of its internal static variables. While this is good cleanup in an abstract sense, the reason to do it right now is that it seems the only practical way to support use of separate flex lexers with common PsqlScanState infrastructure. If we build two non-reentrant lexers then we are going to have problems with dangling buffer pointers in whichever lexer isn't active when we transition from one buffer to another, as well as curious side-effects if we try to share any code between the files. (Horiguchi-san had a different solution to that in his pending patch, but I find it ugly and probably broken for corner cases.) Depending on which version of flex you're using, this may result in getting a "warning: unused variable 'yyg'" warning from psqlscan, similar to the one you'd have seen for a long time in backend/parser/scan.l. I put a local -Wno-error into CFLAGS for the file, for the convenience of those who compile with -Werror. Also, stop compiling psqlscan as part of mainloop.c, and make it a standalone build target instead. This is a lot cleaner than before, though it doesn't really change much in practice as of this commit. (I'm not sure whether the MSVC build scripts will need some help with this part, but the buildfarm will soon tell us.)	2016-03-18 21:22:02 -04:00
Peter Eisentraut	b555ed8102	Merge wal_level "archive" and "hot_standby" into new name "replica". The distinction between "archive" and "hot_standby" existed only because at the time "hot_standby" was added, there was some uncertainty about stability. This is now a long time ago. We would like to move forward with simplifying the replication configuration, but this distinction is in the way, because a primary server cannot tell (without asking a standby or predicting the future) which one of these would be the appropriate level. Pick a new name for the combined setting to make it clearer that it covers all (non-logical) backup and replication uses. The old values are still accepted but are converted internally. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: David Steele <david@pgmasters.net>	2016-03-18 23:56:03 +01:00
Tom Lane	4e1d2a1708	Decouple psqlscan.l from surrounding program. Remove assorted external references from psqlscan.l in preparation for making it usable by other frontend programs. This mostly involves getting rid of direct calls to psql_error() and GetVariable() in favor of introducing a callback-functions struct to encapsulate variable fetching and error printing. In addition, pass the current encoding and standard-strings status as additional parameters to psql_scan_setup instead of looking directly at "pset" or calling additional functions. I did not bother to change some references to psql_error that are in functions that will soon migrate to a psql-specific backslash-command lexer. Other than that, this version of psqlscan.l is capable of compiling standalone. It still depends on assorted src/common functions as well as some encoding-related libpq functions, but we expect that all programs using it will be happy with those dependencies. Kyotaro Horiguchi, somewhat editorialized on by me	2016-03-18 15:05:59 -04:00
Robert Haas	08a6d36dcb	Use INT64_FORMAT instead of %ld for int64. Commit `0011c0091e` introduced this mistake. Patch by me. Reported by Andres Freund, who also reviewed the patch.	2016-03-18 14:54:09 -04:00
Andres Freund	c4901a1e03	Only clear latch self-pipe/event if there is a pending notification. This avoids a good number of, individually quite fast, system calls in scenarios with many quick queries. Besides the aesthetic benefit of seeing fewer superfluous system calls with strace, it also improves performance by ~2% measured by pgbench -M prepared -c 96 -j 8 -S (scale 100). Without having benchmarked it, this patch also adjusts the windows code, as that makes it easier to unify the unix/windows codepaths in a later patch. There's little reason to diverge in behaviour between the platforms. Discussion: CA+TgmoYc1Zm+Szoc_Qbzi92z2c1vRHZmjhfPn5uC=w8bXv6Avg@mail.gmail.com Reviewed-By: Robert Haas	2016-03-18 11:47:05 -07:00
Andres Freund	c17966201c	Make it easier to choose the used waiting primitive in unix_latch.c. This allows for easier testing of the different primitives; in preparation for adding a new primitive. Discussion: 20160114143931.GG10941@awork2.anarazel.de Reviewed-By: Robert Haas	2016-03-18 11:46:54 -07:00
Andres Freund	6bc4d95fcc	Error out if waiting on socket readiness without a specified socket. Previously we just ignored such an attempt, but that seems to serve no purpose but making things harder to debug. Discussion: 20160114143931.GG10941@awork2.anarazel.de 20151230173734.hx7jj2fnwyljfqek@alap3.anarazel.de Reviewed-By: Robert Haas	2016-03-18 11:46:45 -07:00
Andres Freund	fad0f9d8c9	Remove unused, and dangerous, TestLatch() macro. The macro has not seen any in-tree use since latches had been introduced in `2746e5f`, in 2010.	2016-03-18 11:46:42 -07:00
Robert Haas	0bf3ae88af	Directly modify foreign tables. postgres_fdw can now send an UPDATE or DELETE statement directly to the foreign server in simple cases, rather than sending a SELECT FOR UPDATE statement and then updating or deleting rows one-by-one. Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro Horiguchi, Albe Laurenz, Thom Brown, and me.	2016-03-18 13:55:52 -04:00
Tom Lane	3422feccca	Clean up some misplaced #includes. Random .h files have no business including postgres-fe.h (or postgres.h). If that wasn't the first #include done by the calling .c file, it's the .c file that's broken. Noted while prepping Kyotaro Horiguchi's psql lexer refactoring patch.	2016-03-18 13:43:17 -04:00
Teodor Sigaev	3187d6de0e	Introduce parse_ident() SQL-layer function to split qualified identifier into array parts. Author: Pavel Stehule with minor editorization by me and Jim Nasby	2016-03-18 18:16:14 +03:00
Robert Haas	992b5ba30d	Push scan/join target list beneath Gather when possible. This means that, for example, "SELECT expensive_func(a) FROM bigtab WHERE something" can compute expensive_func(a) in the workers rather than the leader if it happens to be parallel-safe, which figures to be a big win in some practical cases. Currently, we can only do this if the entire target list is parallel-safe. If we worked harder, we might be able to evaluate parallel-safe targets in the worker and any parallel-restricted targets in the leader, but that would be more complicated, and there aren't that many parallel-restricted functions that people are likely to use in queries anyway. I think. So just do the simple thing for the moment. Robert Haas, Amit Kapila, and Tom Lane	2016-03-18 09:50:05 -04:00
Robert Haas	2d8a1e22b1	Various minor corrections of and improvements to comments. Aleksander Alekseev	2016-03-18 09:38:59 -04:00
Tom Lane	bd0ab28912	Remove useless double calls of make_parsestate(). Aleksander Alekseev	2016-03-17 16:46:35 -04:00
Robert Haas	c27033ff7c	Update tuplesort.c comments for memory management improvements. I'm committing these changes separately so that it's clear what is Peter's original work versus what I changed. This is a followup to commit `0011c0091e`, and these changes are all by me.	2016-03-17 16:11:14 -04:00
Robert Haas	0011c0091e	Improve memory management for external sorts. Introduce a new memory context which stores tuple data, and reset it at the end of each merge pass; this helps avoid memory fragmentation and, consequently, overallocation. Also, for the final merge pass, eliminate memory context chunk header overhead entirely by allocating all of the memory used for buffering tuples during the merge in a single chunk. Since this modestly increases the number of tuples we can store, grow the memtuples array a bit so that we're less likely to run short of slots there. Peter Geoghegan. Review and testing of patches in this series by Jeff Janes, Greg Stark, Mithun Cy, and me.	2016-03-17 16:10:41 -04:00
Tom Lane	55c3a04d60	Fix assorted breakage in to_char()'s OF format option. In HEAD, fix incorrect field width for hours part of OF when tm_gmtoff is negative. This was introduced by commit `2d87eedc1d` as a result of falsely applying a pattern that's correct when + signs are omitted, which is not the case for OF. In 9.4, fix missing abs() call that allowed a sign to be attached to the minutes part of OF. This was fixed in 9.5 by `9b43d73b3f`, but for inscrutable reasons not back-patched. In all three versions, ensure that the sign of tm_gmtoff is correctly reported even when the GMT offset is less than 1 hour. Add regression tests, which evidently we desperately need here. Thomas Munro and Tom Lane, per report from David Fetter	2016-03-17 15:50:33 -04:00
Teodor Sigaev	f4ceed6ceb	Improve support of Hunspell - allow non-ASCII characters to be used as affix flags. Non-numeric affix flags are now stored as strings instead of the numeric value of the character. - allow 0 to be used as an affix flag in numeric encoded affixes. That adds support for Arabic, Hungarian, Turkish and Brazilian Portuguese languages. Author: Artur Zakirov with heavy editing by me	2016-03-17 17:23:38 +03:00
Robert Haas	0218e8b3fa	Fix typos. Jim Nasby	2016-03-17 07:26:20 -04:00
Peter Eisentraut	fc201dfd95	Add syslog_split_messages parameter Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-03-16 23:21:44 -04:00
Peter Eisentraut	f4c454e9ba	Add syslog_sequence_numbers parameter Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-03-16 23:21:44 -04:00
Tom Lane	47211af17a	Fix "pgbench -C -M prepared". This didn't work because when we dropped and re-established a database connection, we did not bother to reset session-specific state such as the statements-are-prepared flags. The st->prepared[] array certainly needs to be flushed, and I cleared a couple of other fields as well that couldn't possibly retain meaningful state for a new connection. In passing, fix some bogus comments and strange field order choices. Per report from Robins Tharakan.	2016-03-16 23:18:07 -04:00
Tom Lane	5db5146431	Fix j2day() to behave sanely for negative Julian dates. Somebody had apparently once figured that casting to unsigned int would produce the right output for negative inputs, but that would only be true if 2^32 were a multiple of 7, which of course it ain't. We need to use a signed division and then correct the sign of the remainder. AFAICT, the only case where this would arise currently is when doing ISO-week calculations for dates in 4714BC, where we'd compute a negative Julian date representing 4714-01-04BC and then do some arithmetic with it. Since we don't even really document support for such dates, this is not of much consequence. But we may as well get it right. Per report from Vitaly Burovoy.	2016-03-16 20:57:45 -04:00
Tom Lane	a70e13a39e	Be more careful about out-of-range dates and timestamps. Tighten the semantics of boundary-case timestamptz so that we allow timestamps >= '4714-11-24 00:00+00 BC' and < 'ENDYEAR-01-01 00:00+00 AD' exactly, no more and no less, but it is allowed to enter timestamps within that range using non-GMT timezone offsets (which could make the nominal date 4714-11-23 BC or ENDYEAR-01-01 AD). This eliminates dump/reload failure conditions for timestamps near the endpoints. To do this, separate checking of the inputs for date2j() from the final range check, and allow the Julian date code to handle a range slightly wider than the nominal range of the datatypes. Also add a bunch of checks to detect out-of-range dates and timestamps that formerly could be returned by operations such as date-plus-integer. All C-level functions that return date, timestamp, or timestamptz should now be proof against returning a value that doesn't pass IS_VALID_DATE() or IS_VALID_TIMESTAMP(). Vitaly Burovoy, reviewed by Anastasia Lubennikova, and substantially whacked around by me	2016-03-16 19:09:28 -04:00
Robert Haas	f2b74b01d4	Another comment update. I thought this was in my last commit, but I goofed.	2016-03-16 14:28:25 -04:00
Robert Haas	bc55cc0b6a	Fix problems in commit `c16dc1aca5`. Vinayak Pokale provided a patch for a copy-and-paste error in a comment. I noticed that I'd used the word "automatically" nearby where I meant to talk about things being "atomic". Rahila Syed spotted a misplaced counter update. Fix all that stuff.	2016-03-16 13:54:04 -04:00
Robert Haas	c6dda1f48e	Add idle_in_transaction_session_timeout. Vik Fearing, reviewed by Stéphane Schildknecht and me, and revised slightly by me.	2016-03-16 11:30:45 -04:00
Peter Eisentraut	f9e5ed61ed	UCS_to_EUC_JIS_2004.pl: Turn off "test" mode by default It produces debugging output files that are of no further use, so we don't need that by default.	2016-03-16 10:43:05 -04:00
Peter Eisentraut	9dbcb500ca	Make spacing and punctuation consistent	2016-03-16 10:43:05 -04:00
Robert Haas	3aff33aa68	Fix typos. Oskari Saarenmaa	2016-03-15 18:06:11 -04:00
Stephen Frost	fd658dbb30	Avoid incorrectly indicating exclusion constraint wait INSERT ... ON CONFLICT's precheck may have to wait on the outcome of another insertion, which may or may not itself be a speculative insertion. This wait is not necessarily associated with an exclusion constraint, but was always reported that way in log messages if the wait happened to involve a tuple that had no speculative token. Initially discovered through use of ON CONFLICT DO NOTHING, where spurious references to exclusion constraints in log messages were more likely. Patch by Peter Geoghegan. Reviewed by Julien Rouhaud. Back-patch to 9.5 where INSERT ... ON CONFLICT was added.	2016-03-15 18:04:39 -04:00
Alvaro Herrera	5bcc413f80	Fix typos in comments	2016-03-15 17:57:17 -03:00
Robert Haas	c16dc1aca5	Add simple VACUUM progress reporting. There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing. Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.	2016-03-15 13:32:56 -04:00
Tom Lane	0e9b89986b	Cope if platform declares mbstowcs_l(), but not locale_t, in <xlocale.h>. Previously, we included <xlocale.h> only if necessary to get the definition of type locale_t. According to notes in PGAC_TYPE_LOCALE_T, this is important because on some versions of glibc that file supplies an incompatible declaration of locale_t. (This info may be obsolete, because on my RHEL6 box that seems to be the only definition of locale_t; but there may still be glibc's in the wild for which it's a live concern.) It turns out though that on FreeBSD and maybe other BSDen, you can get locale_t from stdlib.h or locale.h but mbstowcs_l() and friends only from <xlocale.h>. This was leaving us compiling calls to mbstowcs_l() and friends with no visible prototype, which causes a warning and could possibly cause actual trouble, since it's not declared to return int. Hence, adjust the configure checks so that we'll include <xlocale.h> either if it's necessary to get type locale_t or if it's necessary to get a declaration of mbstowcs_l(). Report and patch by Aleksander Alekseev, somewhat whacked around by me. Back-patch to all supported branches, since we have been using mbstowcs_l() since 9.1.	2016-03-15 13:19:57 -04:00
Tom Lane	101fd9349e	Add a GetForeignUpperPaths callback function for FDWs. This is basically like the just-added create_upper_paths_hook, but control is funneled only to the FDW responsible for all the baserels of the current query; so providing such a callback is much less likely to add useless overhead than using the hook function is. The documentation is a bit sketchy. We'll likely want to improve it, and/or adjust the call conventions, when we get some experience with actually using this callback. Hopefully somebody will find time to experiment with it before 9.6 feature freeze.	2016-03-14 20:04:48 -04:00
Peter Eisentraut	be6de4c121	Add missing include for self-containment	2016-03-14 19:56:33 -04:00
Robert Haas	270b7daf5c	Fix EXPLAIN ANALYZE SELECT INTO not to choose a parallel plan. We don't support any parallel write operations at present, so choosing a parallel plan causes us to error out. Also, add a new regression test that uses EXPLAIN ANALYZE SELECT INTO; if we'd had this previously, force_parallel_mode testing would have caught this issue. Mithun Cy and Robert Haas	2016-03-14 19:48:46 -04:00
Tom Lane	5864d6a4b6	Provide a planner hook at a suitable place for creating upper-rel Paths. In the initial revision of the upper-planner pathification work, the only available way for an FDW or custom-scan provider to inject Paths representing post-scan-join processing was to insert them during scan-level GetForeignPaths or similar processing. While that's not impossible, it'd require quite a lot of duplicative processing to look forward and see if the extension would be capable of implementing the whole query. To improve matters for custom-scan providers, provide a hook function at the point where the core code is about to start filling in upperrel Paths. At this point Paths are available for the whole scan/join tree, which should reduce the amount of redundant effort considerably. (An alternative design that was suggested was to provide a separate hook for each post-scan-join processing step, but that seems messy and not clearly more useful.) Following our time-honored tradition, there's no documentation for this hook outside the source code. As-is, this hook is only meant for custom scan providers, which we can't assume very much about. A followon patch will implement an FDW callback to let FDWs do the same thing in a somewhat more structured fashion.	2016-03-14 19:23:29 -04:00
Tom Lane	28048cbaa2	Allow callers of create_foreignscan_path to specify nondefault PathTarget. Although the default choice of rel->reltarget should typically be sufficient for scan or join paths, it's not at all sufficient for the purposes PathTargets were invented for; in particular not for upper-relation Paths. So break API compatibility by adding a PathTarget argument to create_foreignscan_path(). To ease updating of existing code, accept a NULL value of the argument as selecting rel->reltarget.	2016-03-14 17:31:28 -04:00
Tom Lane	307c78852f	Rethink representation of PathTargets. In commit `19a541143a` I did not make PathTarget a subtype of Node, and embedded a RelOptInfo's reltarget directly into it rather than having a separately-allocated Node. In hindsight that was misguided micro-optimization, enabled by the fact that at that point we didn't have any Paths with custom PathTargets. Now that PathTarget processing has been fleshed out some more, it's easier to see that it's better to have PathTarget as an independent Node type, even if it does cost us one more palloc to create a RelOptInfo. So change it while we still can. This commit just changes the representation, without doing anything more interesting than that.	2016-03-14 16:59:59 -04:00
Tom Lane	07341a2980	Update PL/Perl's comment about hv_store(). Negative klen is documented since Perl 5.16, and 5.6 is no longer supported so no need to comment about it. Dagfinn Ilmari Mannsåker	2016-03-14 14:45:45 -04:00
Tom Lane	f3f3aae4b7	Improve conversions from uint64 to Perl types. Perl's integers are pointer-sized, so can hold more than INT_MAX on LP64 platforms, and come in both signed (IV) and unsigned (UV). Floating point values (NV) may also be larger than double. Since Perl 5.19.4 array indices are SSize_t instead of I32, so allow up to SSize_t_max on those versions. The limit is not imposed just by av_extend's argument type, but all the array handling code, so remove the speculative comment. Dagfinn Ilmari Mannsåker	2016-03-14 14:38:44 -04:00
Robert Haas	6be84eeb8d	Update more comments for `96198d94cb`. Etsuro Fujita, reviewed (though not completely endorsed) by Ashutosh Bapat, and slightly expanded by me.	2016-03-14 14:29:12 -04:00
Tom Lane	74a379b984	Use repalloc_huge() to enlarge a SPITupleTable's tuple pointer array. Commit `23a27b039d` widened the rows-stored counters to uint64, but that's academic unless we allow the tuple pointer array to exceed 1GB. (It might be a good idea to provide some other limit on how much storage a SPITupleTable can eat. On the other hand, there are plenty of other ways to drive a backend into swap hell.) Dagfinn Ilmari Mannsåker	2016-03-14 14:22:34 -04:00
Robert Haas	3adf9ced17	Improve check for overly-long extensible node name. The old code is bad for two reasons. First, it has an off-by-one error. Second, it won't help if you aren't running with assertions enabled. Per discussion, we want a check here in that case too. Author: KaiGai Kohei, adjusted by me. Reviewed-by: Petr Jelinek Discussion: 56E0D547.1030101@2ndquadrant.com	2016-03-14 13:52:52 -04:00
Tom Lane	2da7549987	pg_stat_get_progress_info() should be marked STRICT. I didn't bother with a catversion bump. Report and patch by Thomas Munro	2016-03-14 12:51:55 -04:00
Tom Lane	ab4ff2889d	Fix memory leak in repeated GIN index searches. Commit `d88976cfa1` removed this code from ginFreeScanKeys(): - if (entry->list) - pfree(entry->list); evidently in the belief that that ItemPointer array is allocated in the keyCtx and so would be reclaimed by the following MemoryContextReset. Unfortunately, it isn't and it won't. It'd likely be a good idea for that to become so, but as a simple and back-patchable fix in the meantime, restore this code to ginFreeScanKeys(). Also, add a similar pfree to where startScanEntry() is about to zero out entry->list. I am not sure if there are any code paths where this change prevents a leak today, but it seems like cheap future-proofing. In passing, make the initial allocation of so->entries[] use palloc not palloc0. The code doesn't depend on unused entries being zero; if it did, the array-enlargement code in ginFillScanEntry() would be wrong. So using palloc0 initially can only serve to confuse readers about what the invariant is. Per report from Felipe de Jesús Molina Bravo, via Jaime Casanova in <CAJGNTeMR1ndMU2Thpr8GPDUfiHTV7idELJRFusA5UXUGY1y-eA@mail.gmail.com>	2016-03-13 16:44:31 -04:00
Peter Eisentraut	96adb14d93	Fix whitespace and remove obsolete gitattributes entry	2016-03-13 16:03:13 -04:00
Magnus Hagander	a1aa8b7ea0	Fix order of MemSet arguments Noted by Tomas Vondra	2016-03-13 13:11:06 +01:00
Tom Lane	4b980167cb	Report memory context stats upon out-of-memory in repalloc[_huge]. This longstanding functionality evidently got lost in commit `3d6d1b5855`. Noted while studying an OOM report from Jaime Casanova. Backpatch to 9.5 where the bug was introduced.	2016-03-13 00:21:07 -05:00
Tom Lane	ab737f6ba9	Fix Windows portability issue in `23a27b039d`. _strtoui64() is available in MSVC builds, but apparently not with other Windows toolchains. Thanks to Petr Jelinek for the diagnosis.	2016-03-12 22:34:47 -05:00
Tom Lane	fc7a9dfddb	Get rid of scribbling on a const variable in psql's print.c. Commit `a2dabf0e1d` had the bright idea that it could modify a "const" global variable if it merely casted away const from a pointer. This does not work on platforms where the compiler puts "const" variables into read-only storage. Depressingly, we evidently have no such platforms in our buildfarm ... an oversight I have now remedied. (The one platform that is known to catch this is recent OS X with -fno-common.) Per report from Chris Ruprecht. Back-patch to 9.5 where the bogus code was introduced.	2016-03-12 18:16:24 -05:00
Tom Lane	23a27b039d	Widen query numbers-of-tuples-processed counters to uint64. This patch widens SPI_processed, EState's es_processed field, PortalData's portalPos field, FuncCallContext's call_cntr and max_calls fields, ExecutorRun's count argument, PortalRunFetch's result, and the max number of rows in a SPITupleTable to uint64, and deals with (I hope) all the ensuing fallout. Some of these values were declared uint32 before, and others "long". I also removed PortalData's posOverflow field, since that logic seems pretty useless given that portalPos is now always 64 bits. The user-visible results are that command tags for SELECT etc will correctly report tuple counts larger than 4G, as will plpgsql's GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples than that are still not exactly the norm, but they're becoming more common. Most values associated with FETCH/MOVE distances, such as PortalRun's count argument and the count argument of most SPI functions that have one, remain declared as "long". It's not clear whether it would be worth promoting those to int64; but it would definitely be a large dollop of additional API churn on top of this, and it would only help 32-bit platforms which seem relatively less likely to see any benefit. Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me	2016-03-12 16:05:29 -05:00
Andres Freund	e01157500f	Include portability/mem.h into fd.c for MAP_FAILED. Buildfarm members gaur and pademelon are old enough not to know about MAP_FAILED; which is used in `428b1d6`. Include portability/mem.h to fix; as already done in a bunch of other places.	2016-03-12 12:16:48 -08:00
Tom Lane	570be1f73f	Re-export a few of createplan.c's make_xxx() functions. CitusDB is using these and don't wish to redesign their code right now. I am not on board with this being a good idea, or a good precedent, but I lack the energy to fight about it.	2016-03-12 12:12:59 -05:00
Robert Haas	7087166a88	pg_upgrade: Convert old visibility map format to new format. Commit `a892234f83` added a second bit per page to the visibility map, but pg_upgrade has been unaware of it up until now. Therefore, a pg_upgrade from an earlier major release of PostgreSQL to any commit preceding this one and following the one mentioned above would result in invalid visibility map contents on the new cluster, very possibly leading to data corruption. This plugs that hole. Masahiko Sawada, reviewed by Jeff Janes, Bruce Momjian, Simon Riggs, Michael Paquier, Andres Freund, me, and others.	2016-03-11 12:34:20 -05:00
Tom Lane	9118d03a8c	When appropriate, postpone SELECT output expressions till after ORDER BY. It is frequently useful for volatile, set-returning, or expensive functions in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are done. Otherwise, the functions might be executed for every row of the table despite the presence of LIMIT, and/or be executed in an unexpected order. For example, in SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10; it's probably desirable that the nextval() values are ordered the same as x, and that nextval() is not run more than 10 times. In the past, Postgres was inconsistent in this area: you would get the desirable behavior if the ordering were performed via an indexscan, but not if it had to be done by an explicit sort step. Getting the desired behavior reliably required contortions like SELECT x, nextval('seq') FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10; This patch conditionally postpones evaluation of pure-output target expressions (that is, those that are not used as DISTINCT, ORDER BY, or GROUP BY columns) so that they effectively occur after sorting, even if an explicit sort step is necessary. Volatile expressions and set-returning expressions are always postponed, so as to provide consistent semantics. Expensive expressions (costing more than 10 times typical operator cost, which by default would include any user-defined function) are postponed if there is a LIMIT or if there are expressions that must be postponed. We could be more aggressive and postpone any nontrivial expression, but there are costs associated with doing so: it requires an extra Result plan node which adds some overhead, and postponement changes the volume of data going through the sort step, perhaps for the worse. 
Since we tend not to have very good estimates of the output width of nontrivial expressions, it's hard to have much confidence in our ability to predict whether postponement would increase or decrease the cost of the sort; therefore this patch doesn't attempt to make decisions conditionally on that. Between these factors and a general desire not to change query behavior when there's not a demonstrable benefit, it seems best to be conservative about applying postponement. We might tweak the decision rules in the future, though. Konstantin Knizhnik, heavily rewritten by me	2016-03-11 12:27:50 -05:00
Teodor Sigaev	b1fdc727c3	Fix Windows build broken in `6943a946c7` Also it fixes dynamic array allocation disallowed by ANSI-C. Author: Stas Kelvich	2016-03-11 20:10:20 +03:00
Teodor Sigaev	8829af47ef	Fix merge affixes for numeric ones Some dictionaries have duplicated base words with different affix sets; we just merge those sets into one set. But previously the merging of sets of affixes was actually a concatenation of strings, which is wrong for the numeric representation of affixes because such a representation uses a comma to separate affixes. Author: Artur Zakirov	2016-03-11 19:47:50 +03:00
Teodor Sigaev	a9eb6c83ef	Bump catalog version missed in `6943a946c7`	2016-03-11 19:31:04 +03:00
Teodor Sigaev	6943a946c7	Tsvector editing functions Adds several tsvector editing functions: convert tsvector to/from text array, set weight for given lexemes, delete lexeme(s), unnest, filter lexemes with given weights Author: Stas Kelvich with some editing by me Reviewers: Tomas Vondra, Teodor Sigaev	2016-03-11 19:22:36 +03:00
Tom Lane	49635d7b3e	Minor additional refactoring of planner.c's PathTarget handling. Teach make_group_input_target() and make_window_input_target() to work entirely with the PathTarget representation of tlists, rather than constructing a tlist and immediately deconstructing it into PathTarget format. In itself this only saves a few palloc's; the bigger picture is that it opens the door for sharing cost_qual_eval work across all of planner.c's constructions of PathTargets. I'll come back to that later. In support of this, flesh out tlist.c's infrastructure for PathTargets a bit more.	2016-03-11 10:24:55 -05:00
Robert Haas	69ab7b9d6c	psql: Don't automatically use expanded format when there's 1 column. Andreas Karlsson and Robert Haas	2016-03-11 08:04:01 -05:00
Robert Haas	481c76abf4	Fix a typo, and remove unnecessary pgstat_report_wait_end(). Per Amit Kapila.	2016-03-11 07:34:00 -05:00
Magnus Hagander	38c83c9b75	Refactor receivelog.c parameters Much cruft had accumulated over time with a large number of parameters passed down between functions very deep. With this refactoring, instead introduce a StreamCtl structure that holds the parameters, and pass around a pointer to this structure instead. This makes it much easier to add or remove fields that are needed deeper down in the implementation without having to modify every function header in the file. Patch by me after much nagging from Andres Reviewed by Craig Ringer and Daniel Gustafsson	2016-03-11 11:15:12 +01:00
Simon Riggs	73e7e49da3	Allow emit_log_hook to see original message text emit_log_hook could only see the translated text, making it harder to identify which message was being sent. Pass original text to allow the exact message to be identified, whichever language is used for logging. Discussion: 20160216.184755.59721141.horiguchi.kyotaro@lab.ntt.co.jp Author: Kyotaro Horiguchi	2016-03-11 09:53:06 +00:00
Robert Haas	a414d96ad2	Simplify GetLockNameFromTagType. The old code is wrong, because it returns a pointer to an automatic variable. And it's also more clever than we really need to be considering that the case it's worrying about should never happen.	2016-03-10 21:37:22 -05:00
Andres Freund	c94f0c29ce	Blindly try to fix dtrace enabled builds, broken in `9cd00c45`. Reported-By: Peter Eisentraut Discussion: 56E2239E.1050607@gmx.net	2016-03-10 17:51:03 -08:00
Andres Freund	9cd00c457e	Checkpoint sorting and balancing. Up to now checkpoints were written in the order they're in the BufferDescriptors. That's nearly random in a lot of cases, which performs badly on rotating media, but even on SSDs it causes slowdowns. To avoid that, sort checkpoints before writing them out. We currently sort by tablespace, relfilenode, fork and block number. One of the major reasons that previously wasn't done, was fear of imbalance between tablespaces. To address that balance writes between tablespaces. The other prime concern was that the relatively large allocation to sort the buffers in might fail, preventing checkpoints from happening. Thus pre-allocate the required memory in shared memory, at server startup. This particularly makes it more efficient to have checkpoint flushing enabled, because that'll often result in a lot of writes that can be coalesced into one flush. Discussion: alpine.DEB.2.10.1506011320000.28433@sto Author: Fabien Coelho and Andres Freund	2016-03-10 17:05:09 -08:00
Andres Freund	428b1d6b29	Allow to trigger kernel writeback after a configurable number of writes. Currently writes to the main data files of postgres all go through the OS page cache. This means that some operating systems can end up collecting a large number of dirty buffers in their respective page caches. When these dirty buffers are flushed to storage rapidly, be it because of fsync(), timeouts, or dirty ratios, latency for other reads and writes can increase massively. This is the primary reason for regular massive stalls observed in real world scenarios and artificial benchmarks; on rotating disks stalls on the order of hundreds of seconds have been observed. On linux it is possible to control this by reducing the global dirty limits significantly, reducing the above problem. But global configuration is rather problematic because it'll affect other applications; also PostgreSQL itself doesn't always generally want this behavior, e.g. for temporary files it's undesirable. Several operating systems allow some control over the kernel page cache. Linux has sync_file_range(2), several posix systems have msync(2) and posix_fadvise(2). sync_file_range(2) is preferable because it requires no special setup, whereas msync() requires the to-be-flushed range to be mmap'ed. For the purpose of flushing dirty data posix_fadvise(2) is the worst alternative, as flushing dirty data is just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages from the page cache. Thus the feature is enabled by default only on linux, but can be enabled on all systems that have any of the above APIs. While desirable and likely possible this patch does not contain an implementation for windows. With the infrastructure added, writes made via checkpointer, bgwriter and normal user backends can be flushed after a configurable number of writes. 
Each of these sources of writes is controlled by a separate GUC, checkpointer_flush_after, bgwriter_flush_after and backend_flush_after respectively; they're separate because the number of flushes that are good are separate, and because the performance considerations of controlled flushing for each of these are different. A later patch will add checkpoint sorting - after that flushes from the checkpoint will almost always be desirable. Bgwriter flushes are most of the time going to be random, which are slow on lots of storage hardware. Flushing in backends works well if the storage and bgwriter can keep up, but if not it can have negative consequences. This patch is likely to have negative performance consequences without checkpoint sorting, but unfortunately so has sorting without flush control. Discussion: alpine.DEB.2.10.1506011320000.28433@sto Author: Fabien Coelho and Andres Freund	2016-03-10 17:04:34 -08:00
Tom Lane	c82c92b111	Give pull_var_clause() reject/recurse/return behavior for WindowFuncs too. All along, this function should have treated WindowFuncs in a manner similar to Aggrefs, ie with an option whether or not to recurse into them. By not considering the case, it was always recursing, which is OK for most callers (although I suspect that the case in prepare_sort_from_pathkeys might represent a bug). But now we need return-without-recursing behavior as well. There are also more than a few callers that should never see a WindowFunc, and now we'll get some error checking on that.	2016-03-10 16:23:52 -05:00
Robert Haas	fd31cd2651	Don't vacuum all-frozen pages. Commit `a892234f83` gave us enough infrastructure to avoid vacuuming pages where every tuple on the page is already frozen. So, replace the notion of a scan_all or whole-table vacuum with the less onerous notion of an "aggressive" vacuum, which will scan pages that are all-visible, but still skip those that are all-frozen. This should greatly reduce the cost of anti-wraparound vacuuming on large clusters where the majority of data is never touched between one cycle and the next, because we'll no longer have to read all of those pages only to find out that we don't need to do anything with them. Patch by me, reviewed by Masahiko Sawada.	2016-03-10 16:14:42 -05:00
Tom Lane	364a9f47ab	Refactor pull_var_clause's API to make it less tedious to extend. In commit `1d97c19a0f` and later `c1d9579dd8`, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway.	2016-03-10 15:53:07 -05:00
Simon Riggs	37c54863cf	Rework wait for AccessExclusiveLocks on Hot Standby Earlier version committed in 9.0 caused spurious waits in some cases. New infrastructure for lock waits in 9.3 used to correct and improve this. Jeff Janes based upon a proposal by Simon Riggs, who also reviewed Additional review comments from Amit Kapila	2016-03-10 19:26:24 +00:00
Robert Haas	53be0b1add	Provide much better wait information in pg_stat_activity. When a process is waiting for a heavyweight lock, we will now indicate the type of heavyweight lock for which it is waiting. Also, you can now see when a process is waiting for a lightweight lock - in which case we will indicate the individual lock name or the tranche, as appropriate - or for a buffer pin. Amit Kapila, Ildus Kurbangaliev, reviewed by me. Lots of helpful discussion and suggestions by many others, including Alexander Korotkov, Vladimir Borodin, and many others.	2016-03-10 12:44:09 -05:00
Magnus Hagander	9d90388247	Avoid crash on old Windows with AVX2-capable CPU for VS2013 builds The Visual Studio 2013 CRT generates invalid code when it makes a 64-bit build that is later used on a CPU that supports AVX2 instructions using a version of Windows before 7SP1/2008R2SP1. Detect this combination, and in those cases turn off the generation of FMA3, per recommendation from the Visual Studio team. The bug is actually in the CRT shipping with Visual Studio 2013, but Microsoft have stated they're only fixing it in newer major versions. The fix is therefore conditioned specifically on being built with this version of Visual Studio, and not previous or later versions. Author: Christian Ullrich	2016-03-10 14:10:18 +01:00
Simon Riggs	e0694cf9c7	Reduce size of two phase file header. Previously 2PC header was fixed at 200 bytes, which in most cases wasted WAL space for a workload using 2PC heavily. Pavan Deolasee, reviewed by Petr Jelinek	2016-03-10 12:51:46 +00:00
Simon Riggs	fcb4bfddb6	Reduce lock level for altering fillfactor. Fabrízio de Royes Mello and Simon Riggs	2016-03-10 12:07:33 +00:00
Robert Haas	090b287fc5	Code review for `b6fb6471f6`. Reports by Tomas Vondra, Vinayak Pokale, and Aleksander Alekseev. Patch by Amit Langote.	2016-03-10 06:07:57 -05:00
Tom Lane	cc402116ca	Remove a couple of useless pstrdup() calls. There's no point in pstrdup'ing the result of TextDatumGetCString, since that's necessarily already a freshly-palloc'd C string. These particular calls are unlikely to be of any consequence performance-wise, but still they're a bad precedent that can confuse future patch authors. Noted by Chapman Flack.	2016-03-09 23:29:05 -05:00
Andres Freund	1d4a0ab19a	Avoid unlikely data-loss scenarios due to rename() without fsync. Renaming a file using rename(2) is not guaranteed to be durable in face of crashes. Use the previously added durable_rename()/durable_link_or_rename() in various places where we previously just renamed files. Most of the changed call sites are arguably not critical, but it seems better to err on the side of too much durability. The most prominent known case where the previously missing fsyncs could cause data loss is crashes at the end of a checkpoint. After the actual checkpoint has been performed, old WAL files are recycled. When they're filled, their contents are fdatasynced, but we did not fsync the containing directory. An OS/hardware crash in an unfortunate moment could then end up leaving that file with its old name, but new content; WAL replay would thus not replay it. Reported-By: Tomas Vondra Author: Michael Paquier, Tomas Vondra, Andres Freund Discussion: 56583BDD.9060302@2ndquadrant.com Backpatch: All supported branches	2016-03-09 18:53:53 -08:00
Andres Freund	606e0f9841	Introduce durable_rename() and durable_link_or_rename(). Renaming a file using rename(2) is not guaranteed to be durable in face of crashes; especially on filesystems like xfs and ext4 when mounted with data=writeback. To be certain that a rename() atomically replaces the previous file contents in the face of crashes and different filesystems, one has to fsync the old filename, rename the file, fsync the new filename, fsync the containing directory. This sequence is not generally adhered to currently; which exposes us to data loss risks. To avoid having to repeat this arduous sequence, introduce durable_rename(), which wraps all that. Also add durable_link_or_rename(). Several places use link() (with a fallback to rename()) to rename a file, trying to avoid replacing the target file out of paranoia. Some of those rename sequences need to be durable as well. There seems little reason to extend several copies of the same logic, so centralize the link() callers. This commit does not yet make use of the new functions; they're used in a followup commit. Author: Michael Paquier, Andres Freund Discussion: 56583BDD.9060302@2ndquadrant.com Backpatch: All supported branches	2016-03-09 18:53:53 -08:00
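The four-step sequence named in the commit above (fsync the old file, rename, fsync the new file, fsync the containing directory) can be sketched with plain POSIX calls. This is not PostgreSQL's `durable_rename()`; the function names `sketch_durable_rename` and `fsync_path` are hypothetical, error reporting is reduced to a return code, and the sketch assumes a platform such as Linux where a directory can be opened read-only and passed to `fsync()`.

```c
#include <assert.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open the path, fsync it, close it. Returns 0 on success, -1 on failure. */
static int fsync_path(const char *path, int open_flags)
{
    int fd = open(path, open_flags);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Rename oldpath to newpath, flushing file and directory along the way. */
int sketch_durable_rename(const char *oldpath, const char *newpath)
{
    char dirbuf[4096];

    assert(oldpath != NULL && newpath != NULL);
    if (strlen(newpath) >= sizeof(dirbuf))
        return -1;

    /* 1. Make sure the file's contents are on disk under the old name. */
    if (fsync_path(oldpath, O_RDWR) != 0)
        return -1;

    /* 2. Perform the rename itself. */
    if (rename(oldpath, newpath) != 0)
        return -1;

    /* 3. Flush the file again under its new name. */
    if (fsync_path(newpath, O_RDWR) != 0)
        return -1;

    /* 4. Flush the containing directory so the new entry is durable. */
    strcpy(dirbuf, newpath);        /* dirname() may modify its argument */
    return fsync_path(dirname(dirbuf), O_RDONLY);
}
```

The sketch assumes both paths live in the same directory, so only one directory needs flushing; a rename across directories would also need the source directory flushed.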
Alvaro Herrera	28f6df3c36	PostgresNode: add backup_fs_hot and backup_fs_cold. These simple methods rely on RecursiveCopy to create a filesystem-level backup of a server. They aren't used anywhere yet, but will be useful for future tests. Author: Craig Ringer Reviewed-By: Michael Paquier, Salvador Fandino, Álvaro Herrera Commitfest-URL: https://commitfest.postgresql.org/9/569/	2016-03-09 19:54:03 -03:00
Alvaro Herrera	a31aaec406	Add filter capability to RecursiveCopy::copypath. This allows skipping copying certain files and subdirectories in tests. This is useful in some circumstances such as copying a data directory; future tests want this feature. Also POD-ify the module. Authors: Craig Ringer, Pallavi Sontakke Reviewed-By: Álvaro Herrera	2016-03-09 18:00:31 -03:00
Tom Lane	a298a1e06f	Fix incorrect handling of NULL index entries in indexed ROW() comparisons. An index search using a row comparison such as ROW(a, b) > ROW('x', 'y') would stop upon reaching a NULL entry in the "b" column, ignoring the fact that there might be non-NULL "b" values associated with later values of "a". This happens because _bt_mark_scankey_required() marks the subsidiary scankey for "b" as required, which is just wrong: it's for a column after the one with the first inequality key (namely "a"), and thus can't be considered a required match. This bit of brain fade dates back to the very beginnings of our support for indexed ROW() comparisons, in 2006. Kind of astonishing that no one came across it before Glen Takahashi, in bug #14010. Back-patch to all supported versions. Note: the given test case doesn't actually fail in unpatched 9.1, evidently because the fix for bug #6278 (i.e., stopping at nulls in either scan direction) is required to make it fail. I'm sure I could devise a case that fails in 9.1 as well, perhaps with something involving making a cursor back up; but it doesn't seem worth the trouble.	2016-03-09 14:51:22 -05:00
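The effect described in the commit above can be illustrated with a toy model in C. It is not the btree code: `SketchEntry`, `row_gt`, and `sketch_scan` are hypothetical names, the "index" is just an array sorted by (a, b) with NULL "b" values last within each "a", and the `stop_at_null_b` switch stands in for wrongly treating the "b" key as required. A row comparison is decided by the first unequal column, so an entry with a larger "a" matches even when its "b" is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One index entry; "a" is assumed non-NULL in this toy model. */
typedef struct
{
    int  a;
    bool b_is_null;
    int  b;
} SketchEntry;

/* ROW(a, b) > ROW(x, y): the first unequal column decides the result. */
static bool row_gt(const SketchEntry *e, int x, int y)
{
    if (e->a != x)
        return e->a > x;
    if (e->b_is_null)
        return false;           /* comparison yields NULL, so no match */
    return e->b > y;
}

/*
 * Forward scan over entries sorted by (a, b NULLS LAST), counting matches.
 * With stop_at_null_b set, the scan ends at the first NULL "b", which is
 * the faulty behavior modeled here.
 */
size_t sketch_scan(const SketchEntry *entries, size_t n,
                   int x, int y, bool stop_at_null_b)
{
    size_t matches = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (stop_at_null_b && entries[i].b_is_null)
            break;
        if (row_gt(&entries[i], x, y))
            matches++;
    }
    return matches;
}
```

With entries (1, 5), (1, NULL), (2, 3), (2, NULL) and the condition ROW(a, b) > ROW(1, 4), the early-stopping scan never reaches the entries with a = 2, while the full scan does.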
Robert Haas	be060cbcd4	Re-pgindent vacuumlazy.c.	2016-03-09 13:51:11 -05:00
Robert Haas	accf7616ff	pgbench: When -T is used, don't wait for transactions beyond end of run. At low rates, this can lead to pgbench taking significantly longer to terminate than the user might expect. Repair. Fabien Coelho, reviewed by Aleksander Alekseev, Álvaro Herrera, and me.	2016-03-09 13:11:05 -05:00
Robert Haas	b6fb6471f6	Add a generic command progress reporting facility. Using this facility, any utility command can report the target relation upon which it is operating, if there is one, and up to 10 64-bit counters; the intent of this is that users should be able to figure out what a utility command is doing without having to resort to ugly hacks like attaching strace to a backend. As a demonstration, this adds very crude reporting to lazy vacuum; we just report the target relation and nothing else. A forthcoming patch will make VACUUM report a bunch of additional data that will make this much more interesting. But this gets the basic framework in place. Vinayak Pokale, Rahila Syed, Amit Langote, Robert Haas, reviewed by Kyotaro Horiguchi, Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao, and Masanori Oyama.	2016-03-09 12:08:58 -05:00
Tom Lane	8776c15c85	Fix incorrect tlist generation in create_gather_plan(). This function is written as though Gather doesn't project; but it does. Even if it did not project, though, we must use build_path_tlist to ensure that the output columns receive correct sortgroupref labeling. Per report from Amit Kapila.	2016-03-09 10:56:46 -05:00
Tom Lane	d31f20e2b5	Fix copy-and-pasteo in comment. Wensheng Zhang	2016-03-09 10:29:14 -05:00

28328 Commits