The old name wasn't very descriptive of the directory's actual
contents, which are historical snapshots in the snapshots/
subdirectory and mapping data for rewritten tuples in mappings/.
There's been a fair amount of discussion about what a good name would
be. I'm settling for pg_logical because it's likely that further data
around logical decoding and replication will need to be stored there
in the future.
Also add the missing entry for the directory into storage.sgml's list
of PGDATA contents.
Bumps catversion as the data directories won't be compatible.
This function wasn't originally thought to be really user-facing,
because converting a table to a view isn't something we expect people
to do manually. So not all that much effort was spent on the error
messages; in particular, while the code will complain that you got
the column types wrong, it won't say exactly what they are. But since
we repurposed the code to also check compatibility of rule RETURNING
lists, it's definitely user-facing. It now seems worthwhile to add
errdetail messages showing exactly what the conflict is when there's
a mismatch of column names or types. This is prompted by bug #10836
from Matthias Raffelsieper, which might have been forestalled if the
error message had reported the wrong column type as being "record".
Back-patch to 9.4, but not into older branches where the set of
translatable error strings is supposed to be stable.
The autocommit-off mode works by issuing an implicit BEGIN just before
any command that is not already in a transaction block and is not itself
a BEGIN or other transaction-control command, nor a command that
cannot be executed inside a transaction block. This commit prevents psql
from issuing such an implicit BEGIN before ALTER SYSTEM because it's
not allowed inside a transaction block.
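As a sketch of the resulting behavior in a psql session with autocommit
off (the parameter and value shown are arbitrary):
    \set AUTOCOMMIT off
    ALTER SYSTEM SET wal_level = 'hot_standby';
    -- no implicit BEGIN is issued before the ALTER SYSTEM, so it is not
    -- rejected for being inside a transaction block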
Backpatch to 9.4 where ALTER SYSTEM was added.
Report by Feike Steenbergen
Historically these database properties could be manipulated only by
manually updating pg_database, which is error-prone and only possible for
superusers. But there seems no good reason not to allow database owners to
set them for their databases, so invent CREATE/ALTER DATABASE options to do
that. Adjust a couple of places that were doing it the hard way to use the
commands instead.
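A sketch of the new commands, assuming the option spellings are
IS_TEMPLATE and ALLOW_CONNECTIONS and using a hypothetical database:
    ALTER DATABASE mydb IS_TEMPLATE true;
    ALTER DATABASE mydb ALLOW_CONNECTIONS false;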
Vik Fearing, reviewed by Pavel Stehule
Most of the existing option names are keywords anyway, but we can get rid
of LC_COLLATE and LC_CTYPE as keywords known to the lexer/grammar. This
immediately reduces the size of the grammar tables by about 8KB, and will
save more when we add additional CREATE/ALTER DATABASE options in future.
A side effect of the implementation is that the CONNECTION LIMIT option
can now also be spelled CONNECTION_LIMIT. We choose not to document this,
however.
Vik Fearing, based on a suggestion by me; reviewed by Pavel Stehule
Almost ten years ago, commit e48322a6d6 broke
the logic in ACX_PTHREAD by looping through all the possible flags rather
than stopping with the first one that would work. This meant that
$acx_pthread_ok was no longer meaningful after the loop; it would usually
be "no", whether or not we'd found working thread flags. The reason nobody
noticed is that Postgres doesn't actually use any of the symbols set up
by the code after the loop. Rather than complicate things some more to
make it work as designed, let's just remove all that dead code, and thereby
save a few cycles in each configure run.
The previous code, perhaps out of concern for avoiding memory leaks, formed
the tuple in one memory context and then copied it to another memory
context. However, this doesn't appear to be necessary, since
index_form_tuple and the functions it calls take precautions against
leaking memory. In my testing, building the tuple directly inside the
sort context shaves several percent off the index build time.
Rearrange things so we do that.
Patch by me. Review by Amit Kapila, Tom Lane, Andres Freund.
When reading large amounts of preexisting WAL during logical decoding
using the SQL interface we possibly could fail to check interrupts in
due time. Similarly the same could happen on systems with a very high
WAL volume while creating a new logical replication slot, independent
of the used interface.
Previously these checks were only performed in xlogreader's read_page
callbacks, while waiting for new WAL to be produced. That's not
sufficient though, if there's never a need to wait. Walsender's send
loop already contains an interrupt check.
Backpatch to 9.4 where the logical decoding feature was introduced.
The assertion failed if WAL_DEBUG or LWLOCK_STATS was enabled; fix that by
using separate memory contexts for the allocations made within those code
blocks.
This patch introduces a mechanism for marking any memory context as allowed
in a critical section. Previously ErrorContext was exempt as a special case.
Instead of a blanket exemption for the checkpointer process, only exempt the
memory context used for the pending ops hash table.
The "false" case was really quite useless since all it did was to throw
an error; a definition not helped in the least by making it the default.
Instead let's just have the "true" case, which emits nested objects and
arrays in JSON syntax. We might later want to provide the ability to
emit sub-objects in Postgres record or array syntax, but we'd be best off
to drive that off a check of the target field datatype, not a separate
argument.
For the functions newly added in 9.4, we can just remove the flag arguments
outright. We can't do that for json_populate_record[set], which already
existed in 9.3, but we can ignore the argument and always behave as if it
were "true". It helps that the flag arguments were optional and not
documented in any useful fashion anyway.
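For example (a sketch; the values are illustrative), nested objects now
simply come through in JSON syntax:
    SELECT * FROM json_to_recordset(
        '[{"a":1,"b":{"x":10}},{"a":2,"b":{"x":20}}]'
    ) AS t(a int, b json);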
When running several postgres clusters on one OS instance it's often
inconveniently hard to identify which "postgres" process belongs to
which postgres instance.
Add the cluster_name GUC, whose value will be included as part of the
process titles if set. With that, processes can more easily be identified
using tools like 'ps'.
To avoid problems with encoding mismatches between postgresql.conf,
consoles, and individual databases, replace non-ASCII chars in the name
with question marks. The length is limited to NAMEDATALEN to make it
less likely to truncate important information at the end of the
status.
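A minimal sketch of the intended usage (the name is arbitrary):
    # postgresql.conf
    cluster_name = 'main'
After a restart, tools like 'ps' then show process titles along the
lines of "postgres: main: checkpointer process".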
Thomas Munro, with some adjustments by me and review by a host of people.
Support for running postgres on Alpha hasn't been tested for a long
while. Due to Alpha's uniquely lax cache coherency model, it's a hard
platform to develop for (especially blindly!) and is thought to be
unlikely to work correctly at present.
As Alpha is the only supported architecture for Tru64, drop support for
it as well. Tru64 support ended in 2012, and it had been in
maintenance-only mode for much longer.
Also remove stray references to __ksr__ and ultrix defines.
We can allow this even without any specific knowledge of the semantics
of the window function, so long as pushed-down quals will either accept
every row in a given window partition, or reject every such row. Because
window functions act only within a partition, such a case can't result
in changing the window functions' outputs for any surviving row.
Eliminating entire partitions in this way obviously can reduce the cost
of the window-function computations substantially.
The fly in the ointment is that it's hard to be entirely sure whether
this is true for an arbitrary qual condition. This patch allows pushdown
if (a) the qual references only partitioning columns, and (b) the qual
contains no volatile functions. We are at risk of incorrect results if
the qual can produce different answers for values that the partitioning
equality operator sees as equal. While it's not hard to invent cases
for which that can happen, it seems to seldom be a problem in practice,
since no one has complained about a similar assumption that we've had
for many years with respect to DISTINCT. The potential performance
gains seem to be worth the risk.
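A sketch of the sort of query that now benefits (table and column names
are hypothetical):
    SELECT *
    FROM (SELECT depname, empno, salary,
                 rank() OVER (PARTITION BY depname ORDER BY salary DESC) AS pos
          FROM empsalary) emp
    WHERE depname = 'sales';
    -- depname is a partitioning column and the qual is not volatile, so
    -- it can be pushed into the subquery, eliminating whole partitions
    -- before the window function is evaluated.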
David Rowley, reviewed by Vik Fearing; some credit is due also to
Thomas Mayer who did considerable preliminary investigation.
Instead of truncating pg_multixact at vacuum time, do it only at
checkpoint time. The reason for doing it this way is twofold: first, we
want it to delete only segments that we're certain will not be required
if there's a crash immediately after the removal; and second, we want to
do it relatively often so that older files are not left behind if
there's an untimely crash.
Per my proposal in
http://www.postgresql.org/message-id/20140626044519.GJ7340@eldon.alvh.no-ip.org
we now execute the truncation in the checkpointer process rather than as
part of vacuum. Vacuum is only in charge of maintaining in shared
memory the value to which it's possible to truncate the files; that
value is stored as part of checkpoints also, and so upon recovery we can
reuse the same value to re-execute truncate and reset the
oldest-value-still-safe-to-use to one known to remain after truncation.
Per bug reported by Jeff Janes in the course of his tests involving
bug #8673.
While at it, update some comments that hadn't been updated since
multixacts were changed.
Backpatch to 9.3, where persistency of pg_multixact files was
introduced by commit 0ac5ad5134.
We were allowing a table's pg_class.relminmxid value to move backwards
when heaps were swapped by VACUUM FULL or CLUSTER. There is a
similar protection against relfrozenxid going backwards, which we
neglected to clone when the multixact stuff was rejiggered by commit
0ac5ad5134.
Backpatch to 9.3, where relminmxid was introduced.
As reported by Heikki in
http://www.postgresql.org/message-id/52401AEA.9000608@vmware.com
Don't assert MultiXactIdIsRunning if the multi came from a tuple that
had been share-locked and later copied over to the new cluster by
pg_upgrade. Doing that causes an error to be raised unnecessarily:
MultiXactIdIsRunning is not open to the possibility that its argument
came from a pg_upgraded tuple, and all its other callers are already
checking; but such multis cannot, obviously, have transactions still
running, so the assert is pointless.
Noticed while investigating the bogus pg_multixact/offsets/0000 file
left over by pg_upgrade, as reported by Andres Freund in
http://www.postgresql.org/message-id/20140530121631.GE25431@alap3.anarazel.de
Backpatch to 9.3, where the buglet was introduced.
A WHERE clause applied to the output of a subquery with DISTINCT should
theoretically be applied only once per distinct row; but if we push it
into the subquery then it will be evaluated at each row before duplicate
elimination occurs. If the qual is volatile this can give rise to
observably wrong results, so don't do that.
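For example (a sketch with a hypothetical table), the volatile qual here
must be applied after duplicate elimination rather than pushed down:
    SELECT * FROM (SELECT DISTINCT col FROM tab) ss
    WHERE random() < 0.5;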
While at it, refactor a little bit to allow subquery_is_pushdown_safe
to report more than one kind of restrictive condition without indefinitely
expanding its argument list.
Although this is a bug fix, it seems unwise to back-patch it into released
branches, since it might de-optimize plans for queries that aren't giving
any trouble in practice. So apply to 9.4 but not further back.
These should not have existed to begin with, but there was apparently some
misunderstanding of the purpose of the opr_sanity regression test item
that checks for operator implementation functions with their own comments.
The idea there is to check for unintentional violations of the rule that
operator implementation functions shouldn't be documented separately
... but for these functions, that is in fact what we want, since the
variadic option is useful and not accessible via the operator syntax.
Get rid of the extra pg_proc entries and fix the regression test and
documentation to be explicit about what we're doing here.
Commit a87c729153 already fixed the bug this
is checking for, but the regression test case it added didn't cover this
scenario. Since we managed to miss the fact that there was a bug at all,
it seems like a good idea to propagate the extra test case forward to HEAD.
I noticed that the functions in jsonfuncs.c sometimes printed error
messages that claimed I'd called some other function. Investigation showed
that this was from repurposing code into "worker" functions without taking
much care as to whether it would mention the right SQL-level function if it
threw an error. Moreover, there was a weird mishmash of messages that
contained a fixed function name, messages that used %s for a function name,
and messages that constructed a function name out of spare parts, like
"json%s_populate_record" (which, quite aside from being ugly as sin, wasn't
even sufficient to cover all the cases). This would put an undue burden on
our long-suffering translators. Standardize on inserting the SQL function
name with %s so as to reduce the number of translatable strings, and pass
function names around as needed to make sure we can report the right one.
Fix up some gratuitous variations in wording, too.
Re-pgindent, remove a lot of random vertical whitespace, remove useless
(if not counterproductive) inline markings, get rid of unnecessary
zero-padding of strings for hashtable searches. No functional changes.
populate_recordset_object_start() improperly created a new hash table
(overwriting the link to the existing one) if called at nest levels
greater than one. This resulted in previous fields not appearing in
the final output, as reported by Matti Hameister in bug #10728.
In 9.4 the problem also affects json_to_recordset.
This perhaps missed detection earlier because the default behavior is to
throw an error for nested objects: you have to pass use_json_as_text = true
to see the problem.
In addition, fix query-lifespan leakage of the hashtable created by
json_populate_record(). This is pretty much the same problem recently
fixed in dblink: creating an intended-to-be-temporary context underneath
the executor's per-tuple context isn't enough to make it go away at the
end of the tuple cycle, because MemoryContextReset is not
MemoryContextResetAndDeleteChildren.
Michael Paquier and Tom Lane
The syntax doesn't let you specify "WITH OIDS" for foreign tables, but it
was still possible with default_with_oids=true. But the rest of the system,
including pg_dump, isn't prepared to handle foreign tables with OIDs
properly.
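A sketch of the case now prevented (server and table names are
hypothetical):
    SET default_with_oids = true;
    CREATE FOREIGN TABLE ft (a int) SERVER myserver;
    -- this can no longer leave ft with an OID column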
Backpatch down to 9.1, where foreign tables were introduced. It's possible
that there are databases out there that already have foreign tables with
OIDs. There isn't much we can do about that, but at least we can prevent
them from being created in the future.
Patch by Etsuro Fujita, reviewed by Hadi Moshayedi.
Normally, this won't matter too much; but if I/O is really slow, for
example because the system is overloaded, we might write many pages
before checking for interrupts. A single toast insertion might
write up to 1GB of data, and a multi-insert could write hundreds
of tuples (and their corresponding TOAST data).
At "DROP RULE/TRIGGER triggername ON ...", tab-complete tables that have
a rule/trigger with that name.
At "ALTER TABLE tablename ENABLE/DISABLE TRIGGER/RULE ...", tab-complete to
rules/triggers on that table. Previously, we would tab-complete to all
rules or triggers, not just those that are on that table.
Also, filter out internal RI triggers from the list. You can't DROP them,
and enabling/disabling them is such a rare (and dangerous) operation that
it seems better to hide them.
Andreas Karlsson, reviewed by Ian Barwick.
The catcache code is effectively assuming this already, so let's insist
that the catalog and index are actually declared that way.
Having done that, the comments in indexing.h about non-unique indexes
not being used for catcaches are completely redundant not just mostly so;
and we didn't have such a comment for every such index anyway. So let's
get rid of them.
Per discussion of whether we should identify primary keys for catalogs.
We might or might not take that further step, but this change in itself
will allow quicker detection of misdeclared catcaches, so it seems worth
doing in any case.
Since fdf9e21196, lazy_vacuum_page() rechecks the all-visible status
of pages in the second pass over the heap. It does so inside a
critical section, but both visibilitymap_test() and
heap_page_is_all_visible() perform operations that should not happen
inside one. The former potentially performs IO and both potentially do
memory allocations.
To fix, simply move all the all-visible handling outside the critical
section. Doing so means that the PD_ALL_VISIBLE on the page won't be
included in the full page image of the HEAP2_CLEAN record anymore. But
that's fine; the flag will be set by the HEAP2_VISIBLE record logged later.
Backpatch to 9.3 where the problem was introduced. The bug only came
to light due to the assertion added in 4a170ee9 and isn't likely to
cause problems in production scenarios. The worst outcome is an
avoidable PANIC restart.
This also gets rid of the difference in the order of operations
between master and standby mentioned in 2a8e1ac5.
Per reports from David Leverton and Keith Fiske in bug #10533.
The existence of the assert_enabled variable (backing the
debug_assertions GUC) reduced the amount of knowledge some static code
checkers (like coverity and various compilers) could infer from the
existence of the assertion. That could have been solved by optionally
removing the assert_enabled variable from the Assert() et al macros
at compile time when some special macro is defined, but the resulting
complication doesn't seem to be worth the gain from having
debug_assertions. Recompiling is fast enough.
The debug_assertions GUC is still available, but readonly, as it's
useful when diagnosing problems. The commandline/client startup option
-A, which previously also allowed enabling/disabling assertions, has
been removed as it doesn't serve a purpose anymore.
While at it, reduce code duplication in bufmgr.c and localbuf.c
assertions checking for spurious buffer pins. That code had to be
reindented anyway to cope with the assert_enabled removal.
ExecMakeTableFunctionResult evaluated the arguments for a function-in-FROM
in the query-lifespan memory context. This is insignificant in simple
cases where the function relation is scanned only once; but if the function
is in a sub-SELECT or is on the inside of a nested loop, any memory
consumed during argument evaluation can add up quickly. (The potential for
trouble here had been foreseen long ago, per existing comments; but we'd
not previously seen a complaint from the field about it.) To fix, create
an additional temporary context just for this purpose.
Per an example from MauMau. Back-patch to all active branches.
Give passwords to each user created in support of an ECPG connection
test case. Use SET SESSION AUTHORIZATION, not a fresh connection, to
reduce privileges during a dblink test case.
To test against such a server, both the "make installcheck-world"
environment and the postmaster environment must provide the default
user's password; $PGPASSFILE is the principal way to do so. (The
postmaster environment needs it for dblink and postgres_fdw tests.)
Commit ea9df812d8 failed to include
NUM_BUFFER_PARTITIONS in this offset, resulting in a bad offset.
Ultimately this threw off NUM_FIXED_LWLOCKS which is based on
earlier offsets, leading to memory allocation problems. It seems
likely to have also caused increased LWLOCK contention when
serializable transactions were used, because lightweight locks used
for that overlapped others.
Reported by Amit Kapila with analysis and fix.
Backpatch to 9.4, where the bug was introduced.
Until now, data_directory could be set in both postgresql.conf and
postgresql.auto.conf. This could cause problematic situations like a
circular definition. To avoid such situations, this commit forbids
setting data_directory in postgresql.auto.conf.
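For example (a sketch; the path is arbitrary), this is now rejected
instead of being written to postgresql.auto.conf:
    ALTER SYSTEM SET data_directory = '/some/other/location';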
Backpatch this to 9.4 where ALTER SYSTEM command was introduced.
Amit Kapila, reviewed by Abhijit Menon-Sen, with minor adjustments by me.
Arrange for postmaster child processes to respond to two environment
variables, PG_OOM_ADJUST_FILE and PG_OOM_ADJUST_VALUE, to determine whether
they reset their OOM score adjustments and if so to what. This is superior
to the previous design involving #ifdef's in several ways. The behavior is
now available in a default build, and both ends of the adjustment --- the
original adjustment of the postmaster's level and the subsequent
readjustment by child processes --- can now be controlled in one place,
namely the postmaster launch script. So it's no longer necessary for the
launch script to act on faith that the server was compiled with the
appropriate options. In addition, if someone wants to use an OOM score
other than zero for the child processes, that doesn't take a recompile
anymore; and we no longer have to cater separately to the two different
historical kernel APIs for this adjustment.
Gurjeet Singh, somewhat revised by me
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
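For example (a sketch with hypothetical tables and columns), one
correlated sub-SELECT computes both target columns at once:
    UPDATE accounts
       SET (contact_first_name, contact_last_name) =
           (SELECT first_name, last_name
              FROM salesmen
             WHERE salesmen.id = accounts.sales_id);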
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
Catch up with commit b8cc8f94730610c0189aa82dfec4ae6ce9b13e34's
introduction of the HAVE_UUID_OSSP symbol to the principal build
process. Back-patch to 9.4, where that commit appeared.
Since most of the system thinks AND and OR are N-argument expressions
anyway, let's have the grammar generate a representation of that form when
dealing with input like "x AND y AND z AND ...", rather than generating
a deeply-nested binary tree that just has to be flattened later by the
planner. This avoids stack overflow in parse analysis when dealing with
queries having more than a few thousand such clauses; and in any case it
removes some rather unsightly inconsistencies, since some parts of parse
analysis were generating N-argument ANDs/ORs already.
It's still possible to get a stack overflow with weirdly parenthesized
input, such as "x AND (y AND (z AND ( ... )))", but such cases are not
mainstream usage. The maximum depth of parenthesization is already
limited by Bison's stack in such cases, anyway, so that the limit is
probably fairly platform-independent.
Patch originally by Gurjeet Singh, heavily revised by me
Any OS user able to access the socket can connect as the bootstrap
superuser and proceed to execute arbitrary code as the OS user running
the test. Protect against that by placing the socket in a temporary,
mode-0700 subdirectory of /tmp. The pg_regress-based test suites and
the pg_upgrade test suite were vulnerable; the $(prove_check)-based test
suites were already secure. Back-patch to 8.4 (all supported versions).
The hazard remains wherever the temporary cluster accepts TCP
connections, notably on Windows.
As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR. Popular non-default values
like /var/run/postgresql are often unwritable to the build user.
Security: CVE-2014-0067
This function is pervasive on free software operating systems; import
NetBSD's implementation. Back-patch to 8.4, like the commit that will
harness it.
We have for a long time been able to prove implications and refutations
between clauses structured like "expr op const" with the same subexpression
and btree-related operators; for example that "x < 4" implies "x <= 5".
The implication machinery is needed to detect usability of partial indexes,
and the refutation machinery is needed to implement constraint exclusion.
This patch extends that machinery to make proofs for operator expressions
involving the same two immutable-but-not-necessarily-just-Const input
expressions, ie does "expr1 op1 expr2" prove or refute "expr1 op2 expr2" or
"expr2 op2 expr1"? An important example is that we can now prove "x = y"
given "y = x", which formerly the code could not deduce unless x or y was a
constant. We can make use of the system's knowledge of operator commutator
and negator pairs, and can also make use of btree opclass relationships,
for example "x < y" implies "x <= y" and refutes "x > y" (notice that
neither of these could be proven just from commutator or negator links).
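A sketch of the kind of deduction this enables, with a hypothetical
table and partial index:
    CREATE TABLE t (x int, y int);
    CREATE INDEX t_x_idx ON t (x) WHERE x <= y;
    -- a query with "WHERE x < y" can now be proven to satisfy the
    -- partial-index predicate, since x < y implies x <= y even though
    -- neither side is a constant
    SELECT * FROM t WHERE x < y;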
Inspired by a gripe from Brian Dunavant. This seems more like a new
feature than a bug fix, though, so no back-patch.
Prior to 9.0, pg_dump handled comments on large objects by dumping a bunch
of COMMENT commands into a single BLOB COMMENTS archive object. With
sufficiently many such comments, some of the commands would likely get
split across bufferloads when restoring, causing failures in
direct-to-database restores (though no problem would be evident in text
output). This is the same type of issue we have with table data dumped as
INSERT commands, and it can be fixed in the same way, by using a mini SQL
lexer to figure out where the command boundaries are. Fortunately, the
COMMENT commands are no more complex to lex than INSERTs, so we can just
re-use the existing lexer for INSERTs.
Per bug #10611 from Jacek Zalewski. Back-patch to all active branches.
We should report the errno when we get a failure from functions like
BufFileWrite. "ERROR: write failed" is unreasonably taciturn for a
case that's well within the realm of possibility; I've seen it a
couple times in the buildfarm recently, in situations that were
probably out-of-disk-space, but it'd be good to see the errno
to confirm it.
I think this code was originally written without assuming that
the buffile.c functions would return useful errno; but most other
callers *are* assuming that, and a quick look at the buffile code
gives no reason to suppose otherwise.
Also, a couple of the old messages were phrased on the assumption
that a short read might indicate a logic bug in tuplestore itself;
but that code's pretty well tested by now, so a filesystem-level
problem seems much more likely.
Since we commonly test pg_dump/pg_restore by seeing whether they can dump
and restore the regression test database, it behooves us to include some
large objects in that test scenario.
I tried to include a comment on one of these large objects to improve
the test scenario further ... but it turns out that pg_upgrade fails to
preserve comments on large objects, and its regression test notices
the discrepancy. So uncommenting that COMMENT is a TODO for later.
Robert Frost is no longer with us, but his copyrights still are, so
let's stop using "Stopping by Woods on a Snowy Evening" as test data
before somebody decides to sue us. Wordsworth is more safely dead.
Memorialize the expected output of the query that libpq has been using for
many years to get the OIDs of large-object support functions. Although
we really ought to change the way libpq does this, we must expect that
this query will remain in use in the field for the foreseeable future,
so until we're ready to break compatibility with old libpq versions
we'd better check the results stay the same. See the recent lo_create()
fiasco.
The previous naming broke the query that libpq's lo_initialize() uses
to collect the OIDs of the server-side functions it requires, because
that query effectively assumes that there is only one function named
lo_create in the pg_catalog schema (and likewise only one lo_open, etc).
While we should certainly make libpq more robust about this, the naive
query will remain in use in the field for the foreseeable future, so it
seems the only workable choice is to use a different name for the new
function. lo_from_bytea() won a small straw poll.
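A sketch of calling the renamed function (passing 0 asks the system to
choose a new large object OID):
    SELECT lo_from_bytea(0, '\xdeadbeef'::bytea);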
Back-patch into 9.4 where the new function was introduced.
If a sub-select-in-FROM gets flattened into the upper query, then we
naturally get rid of any output columns that are defined in the sub-select
text but not actually used in the upper query. However, this doesn't
happen when it's not possible to flatten the subquery, for example because
it contains GROUP BY, LIMIT, etc. Allowing the subquery to compute useless
output columns is often fairly harmless, but sometimes it has significant
performance cost: the unused output might be an expensive expression,
or it might be a Var from a relation that we could remove entirely (via
the join-removal logic) if only we realized that we didn't really need
that Var. Situations like this are common when expanding views, so it
seems worth taking the trouble to detect and remove unused outputs.
Because the upper query's Var numbering for subquery references depends on
positions in the subquery targetlist, we don't want to renumber the items
we leave behind. Instead, we can implement "removal" by replacing the
unwanted expressions with simple NULL constants. This wastes a few cycles
at runtime, but not enough to justify more work in the planner.
Change the order of checks in similar functions to be the same; remove
a parameter that's not needed anymore; rename a memory context and
expand a couple of comments.
Per review comments from Amit Kapila
This preserves user-specified LDFLAGS; we already kept user-specified
CFLAGS and CPPFLAGS. Given the shortage of complaints and the fact that
any problem caused is likely to appear at build time, no back-patch.
Dag-Erling Smørgrav and Noah Misch
The MSVC build process already did so; this fixes the principal build
process to match. Both processes already did likewise for src/common.
This lets server builds of src/port reference postgres.exe data symbols.
When we grabbed this file off the Snowball project's website, we mistakenly
supposed that it was in LATIN1 encoding, but evidently it was actually in
LATIN2. This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
bug #10589 from Zoltán Sörös. We'd have messed up u-double-acute too,
but there aren't any of those in the file. Other characters used in the
file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
the problem for so long.
The error is not only ours: the Snowball project also was confused about
which encoding is required for Hungarian. But dealing with that will
require source-code changes that I'm not at all sure we'll wish to
back-patch. Fixing the stopword file seems reasonably safe to back-patch
however.
Although this bug is already fixed in post-9.2 branches, the case
triggering it is quite different from what was under consideration
at the time. It seems worth memorializing this example in HEAD
just to make sure it doesn't get broken again in future.
Extracted from commit 187ae17300.
Previously, the code used a node label of zero both for strings that
contain no bytes beyond the inner tuple's prefix, and for cases where an
"allTheSame" inner tuple has to be split to allow a string with a different
next byte to be inserted into it. Failing to distinguish these cases meant
that if a string ending with the current prefix needed to be inserted into
an allTheSame tuple, we got into an infinite loop, because after splitting
the tuple we'd descend into the child allTheSame tuple and then find we
need to split again.
To fix, instead use -1 and -2 as the node labels for these two cases.
This requires widening the node label type from "char" to int2, but
fortunately SPGiST stores all pass-by-value node label types in their
Datum representation, which means that this change is transparently upward
compatible so far as the on-disk representation goes. We continue to
recognize zero as a dummy node label for reading purposes, but will not
attempt to push new index entries down into such a label, so that the loop
won't occur even when dealing with an existing index.
Per report from Teodor Sigaev. Back-patch to 9.2 where the faulty
code was introduced.
In a50d976254 I already changed this, but got it wrong for the case
where the number of members is larger than the number of entries that
fit in the last page of the last segment.
As reported by Serge Negodyuck in a followup to bug #8673.
A ReorderBufferTransaction's end_lsn, the sentPtr advertised by
walsender keepalive messages, and the end location remembered by the
decoding get_*changes* SQL functions all use the location of the last
read record + 1. I.e. the LSN points to the beginning of the next
record. That cannot realistically be changed without changing the
replication protocol because that's how keepalive messages have worked
since 9.0.
The bug is that the logic inside the snapshot builder, which decides
whether a transaction's contents should be decoded, assumed the start
location would point towards the last byte of the last record. The
reason this didn't actually cause visible problems is that currently
that decision is only made for commit records. Since interesting
transactions always have at least one additional record - containing
actual data - we'd never skip a transaction.
But if there ever were transactions, or other events, with just one
record containing important information, we'd skip them after stopping
and restarting logical decoding.
It's critical that the backend's idea of LOBLKSIZE match the way data has
actually been divided up in pg_largeobject. While we don't provide any
direct way to adjust that value, doing so is a one-line source code change
and various people have expressed interest recently in changing it. So,
just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in
pg_control and cross-check that the backend's compiled-in setting matches
the on-disk data.
Also tweak the code in inv_api.c so that fetches from pg_largeobject
explicitly verify that the length of the data field is not more than
LOBLKSIZE. Formerly we just had Asserts() for that, which is no protection
at all in production builds. In some of the call sites an overlength data
value would translate directly to a security-relevant stack clobber, so it
seems worth one extra runtime comparison to be sure.
In the back branches, we can't change the contents of pg_control; but we
can still make the extra checks in inv_api.c, which will offer some amount
of protection against running with the wrong value of LOBLKSIZE.
Previously there's been a mix between 'slotname' and 'slot_name'. It's
not nice to be unnecessarily inconsistent in a new feature. As a
post-beta1 initdb is now required in the wake of eeca4cd35e, fix the
inconsistencies.
Most of the changes won't affect usage of replication slots because the
majority of them concern function parameter names. The prominent
exception to that is that the recovery.conf parameter
'primary_slotname' is now named 'primary_slot_name'.
The original location in create_function_3.sql didn't invite the close
scrutiny warranted for adding new leakproof functions. Add comments
to the test explaining that functions should only be added after
careful consideration and understanding what a leakproof function is.
Per complaint from Tom Lane after 5eebb8d954.
The way the code was written, the padding was copied from uninitialized
memory areas. Because the structs are local variables in the code where
the WAL records are constructed, making them larger and zeroing the padding
bytes would not make the code very pretty, so rather than fixing this
directly by zeroing out the padding bytes, it seems more clear to not try to
align the tuples in the WAL records. The redo functions are taught to copy
the tuple header to a local variable to avoid unaligned access.
Stable-branches have the same problem, but we can't change the WAL format
there, so fix in master only. Reading a few random extra bytes from the stack
is harmless in practice, so it's not worth crafting a different
back-patchable fix.
Per reports from Kevin Grittner and Andres Freund, using clang static
analyzer and Valgrind, respectively.
Buildfarm says we get different plans on 32-bit and 64-bit platforms,
probably because of MAXALIGN-related differences in memory-consumption
calculations. Add some dummy WHERE clauses so that the planner estimates
different sizes for the three generate_series() relations; that should
stabilize the choice of join order.
This is needed to allow ORDER BY, DISTINCT, etc to work as expected for
pg_lsn values.
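For example (a sketch with arbitrary values), sorting now works:
    SELECT lsn
    FROM (VALUES ('0/16B6158'::pg_lsn), ('0/16B3748'::pg_lsn)) v(lsn)
    ORDER BY lsn;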
We had previously decided to put this off for 9.5, but in view of commit
eeca4cd35e there's no reason to avoid a
catversion bump for 9.4beta2, and this does make a pretty significant
usability difference for pg_lsn.
Michael Paquier, with fixes from Andres Freund and Tom Lane
This should have been done in 6bc8ef0b7f
and/or 50e547096c, but better late than
never. If we don't change this then we risk 9.3 pg_controldata or
pg_resetxlog being inappropriately used against a 9.4 pg_control file,
or vice versa.
HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in an aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING: concurrent delete in progress within table "..."
possibly in an endless, uninterruptible loop.
Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both more in line with the other
visibility routines and arguably more correct. The latter because an
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.
As a cautionary measure also insert an interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes of repeated loops, like a row that several
sessions try to update and all fail, and the cost of checking in the
retry case is low.
As this bug goes back all the way to the introduction of
subtransactions in 573a71a5da backpatch to all supported releases.
Reported-By: Sandro Santilli
187492b6c2 changed pgstat.c so that
the stats files were saved into $PGDATA/pg_stat directory when the server
was shut down. But it accidentally forgot to change the location of the
pg_stat_statements permanent stats file. This commit fixes pg_stat_statements
so that its stats file is also saved into $PGDATA/pg_stat at shutdown.
Since this fix changes the file layout, we don't back-patch it to 9.3
where this oversight was introduced.
Per gripe from Peter Eisentraut and Tom Lane.
The output is slightly different, but still ISO 8601 compliant: to_char
doesn't output the minutes when the time zone offset is an integer number of
hours, while EncodeDateTime outputs ":00".
The code is slightly adapted from code in xml.c
Previously, any backslash in text being escaped for JSON was doubled so
that the result was still valid JSON. However, this led to some perverse
results in the case of Unicode escape sequences. These are now detected
and the initial backslash is no longer escaped. All other backslashes
are still escaped. No validity check is performed; all that is looked
for is \uXXXX where X is a hexadecimal digit.
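A sketch of the difference (the value is illustrative):
    SELECT to_json('\u00e9'::text);
    -- the backslash in the \u sequence is no longer doubled, so the
    -- result is "\u00e9" rather than "\\u00e9"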
This is a change from the 9.2 and 9.3 behaviour, as noted in the
release notes.
Per complaint from Teodor Sigaev.
Many JSON processors require timestamp strings in ISO 8601 format in
order to convert the strings. When converting a timestamp, with or
without timezone, to a JSON datum we therefore now use such a format
rather than the type's default text output, in functions such as
to_json().
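For example (a sketch; the offset shown depends on the session's
TimeZone setting):
    SET timezone = 'UTC';
    SELECT to_json('2014-06-03 15:20:45+00'::timestamptz);
    -- now produces "2014-06-03T15:20:45+00:00" rather than the type's
    -- default text output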
This is a change in behaviour from 9.2 and 9.3, as noted in the release
notes.
This test previously used a data value containing U+0080, and would
therefore fail if the database encoding didn't have an equivalent to
that; which only about half of our supported server encodings do.
We could fall back to using some plain-ASCII character, but that seems
like it's losing most of the point of the test. Instead switch to using
U+00A0 (no-break space), which translates into all our supported encodings
except the four in the EUC_xx family.
Per buildfarm testing. Back-patch to 9.1, which is as far back as this
test is expected to succeed everywhere. (9.0 has the test, but without
back-patching some 9.1 code changes we could not expect to get consistent
results across platforms anyway.)
Because RecoveryConflictInterrupt() didn't set the process latch,
anything using the latter to wait for events didn't get notified about
recovery conflicts. Most latch users are never the target of recovery
conflicts, which explains the lack of reports about this until
now.
Since 9.3, two possibly affected users exist though: the SQL-callable
pg_sleep() now uses latches to wait and background workers are
expected to use latches in their main loop. Both would currently wait
until the end of WaitLatch's timeout.
Fix by adding a SetLatch() to RecoveryConflictInterrupt(). It'd also
be possible to fix the issue by having each latch user set
set_latch_on_sigusr1. That seems failure prone though, as most of
these callsites won't often receive recovery conflicts and thus will
likely only be tested against normal query cancels et al. It'd also be
unnecessarily verbose.
Backpatch to 9.1 where latches were introduced. Arguably 9.3 would be
sufficient, because that's where pg_sleep() was converted to waiting
on the latch and background workers got introduced; but there could be
user level code making use of the latch pre 9.3.
Use the unaligned/no rowcount output mode in a regression test that
shows all built-in leakproof functions. Currently a new leakproof
function will often change the alignment of all existing functions,
making it hard to see the actual difference and creating unnecessary
patch conflicts.
Noticed while looking over a patch introducing new leakproof functions.
Instead of iterating over jsonb structures, use the built-in functions
findJsonbValueFromContainerLen() and getIthJsonbValueFromContainer() to
extract values directly. These functions use algorithms that are
O(log n) and O(1) respectively, whereas iterating is O(n), so we should
see considerable speedup here.
Teodor Sigaev.
As of Xcode 5.0, Apple isn't including the Python framework as part of the
SDK-level files, which means that linking to it might fail depending on
whether Xcode thinks you've selected a specific SDK version. According to
their Tech Note 2328, they've basically deprecated the framework method of
linking to libpython and are telling people to link to the shared library
normally. (I'm pretty sure this is in direct contradiction to the advice
they were giving a few years ago, but whatever.) Testing says that this
approach works fine at least as far back as OS X 10.4.11, so let's just
rip out the framework special case entirely. We do still need a special
case to decide that OS X provides a shared library at all, unfortunately
(I wonder why the distutils check doesn't work ...). But this is still
less of a special case than before, so it's fine.
Back-patch to all supported branches, since we'll doubtless be hearing
about this more as more people update to recent Xcode.
This reverts commit 45b7abe59e.
It turns out that the %name-prefix syntax without "=" does not work
at all in pre-2.4 Bison. We are not prepared to make such a large
jump in minimum required Bison version just to suppress a warning
message in a version hardly any developers are using yet.
When 3.0 gets more popular, we'll figure out a way to deal with this.
In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
anyone using 3.0 who doesn't want to see the warning.
Sometimes CREATE_REPLICATION_SLOT ... LOGICAL ... needs to wait for
further WAL using WalSndWaitForWal(). That used to always respect
wal_sender_timeout and kill the session when waiting long enough
because no feedback/ping messages can be sent while the slot is still
being created.
Introduce the notion that last_reply_timestamp = 0 means that the
walsender currently doesn't need timeout processing to avoid that
problem. Use that notion for CREATE_REPLICATION_SLOT ... LOGICAL.
Bug report and initial patch by Steve Singer, revised by me.
The only caller of compareJsonbScalarValue that needed locale-sensitive
comparison of strings was also the only caller that didn't just check for
equality. Separate the two cases for clarity: compareJsonbScalarValue now
does locale-sensitive comparison, and a new function,
equalsJsonbScalarValue, just checks for equality.
Fix an over-zealous assertion, which didn't take into account that sometimes
a scalar element can be compared against an array/object element.
Avoid comparing possibly-uninitialized local variables when end-of-array or
end-of-object is reached. Also fix and enhance comments a bit.
Peter Geoghegan, per reports by Pavel Stehule and me.
%name-prefix doesn't use an "=" sign according to the Bison docs, but it
silently accepted one anyway, until Bison 3.0. This was originally a
typo of mine in commit 012abebab1, and we
seem to have slavishly copied the error into all the other grammar files.
Per report from Vik Fearing; analysis by Peter Eisentraut.
Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.
Move the code that sends the initial status information as well as the
calculation of paths inside the ENSURE_ERROR_CLEANUP block. If this code
failed, we would "leak" a counter of number of concurrent backups, thereby
making the system always believe it was in backup mode. This could happen
if the sending failed (which it probably never did given that the small
amount of data to send would never cause a flush) or if the psprintf calls
ran out of memory. Both are very low risk, but all operations after
do_pg_start_backup should be protected.
In general it's not a good idea for built-in types in the 'U' category
to be marked preferred; they could draw behavior away from user-defined
types with similarly-named operators. pg_lsn is probably at low risk
of that right now given the lack of casts between it and other types,
but that doesn't make this marking OK.
Ordinarily we'd bump catversion when changing any predefined catalog
contents like this, but since we're past beta1, the costs of a forced
initdb seem to outweigh the benefits of guaranteed behavioral consistency.
There's no known behavioral impact today anyway --- this is more
in the nature of making sure there are no problems in the future.
Per an off-list complaint from Thomas Fanghaenel.
The recent addition of regression tests to uuid-ossp exposed the fact
that the MSVC build system wasn't being consistent about whether it was
building/testing that contrib module, ie, it would try to test the module
even when it hadn't built it. The same hazard was latent for sslinfo.
For the moment I just copied the more up-to-date logic from point A to
point B, but this is screaming for refactoring.
Per buildfarm results.
Commit 5035701e07 improved xlog.c's method
for creating a database system identifier, but I neglected to fix the
copy of that code appearing in pg_resetxlog.c. Spotted by Andres Freund.
Allow the contrib/uuid-ossp extension to be built atop any one of these
three popular UUID libraries. (The extension's name is now arguably a
misnomer, but we'll keep it the same so as not to cause unnecessary
compatibility issues for users.)
We would not normally consider a change like this post-beta1, but the issue
has been forced by our upgrade to autoconf 2.69, whose more rigorous header
checks are causing OSSP's header files to be rejected on some platforms.
It's been foreseen for some time that we'd have to move away from depending
on OSSP UUID due to lack of upstream maintenance, so this is a down payment
on that problem.
While at it, add some simple regression tests, in hopes of catching any
major incompatibilities between the three implementations.
Matteo Beccati, with some further hacking by me
The bug was caused by omitting 'I:' from the short argument list to
getopt_long(). To make similar bugs less likely in the future, reorder
the options in the --help output and in the long and short option lists
so they are in the same order (alphabetical within groups).
Report and fix by Michael Paquier, some additional reordering by me.
The new page deletion code didn't cope with the case where the target page's
right sibling was marked half-dead. It failed a sanity check which checked
that the downlinks in the parent page match the lower level, because a
half-dead page has no downlink. To cope, check for that condition, and
just give up on the deletion if it happens. The vacuum will finish the
deletion of the half-dead page when it gets there, and on the next vacuum
after that the empty page can be deleted.
Reported by Jeff Janes.
HeapTupleHeaderGetCmax() asserts that it is only used if the tuple has
been updated by the current transaction. That check is correct and
sensible but requires allocating memory if xmax is a multixact. When
wal_level is set to logical, cmax needs to be included in a WAL record
generated inside a critical section, which can trigger the assertion
added in 4a170ee9e.
Reported-By: Steve Singer
Define padding bytes in SharedInvalidationMessage structs to be
defined. Otherwise the sinvaladt.c ringbuffer, which is accessed by
multiple processes, will cause spurious valgrind warnings about
undefined memory being used. That's because valgrind remembers the
undefined bytes from the last local process's store, not realizing
that another process has written since, filling the previously
uninitialized bytes.
Commit af7914c662, which introduced the
EXPLAIN (TIMING) option, for some reason coded explain.c to look at
planstate->instrument->need_timer rather than es->timing to decide
whether to print timing info. However, the former flag might get set
as a result of contrib/auto_explain wanting timing information. We
certainly don't want activation of auto_explain to change user-visible
statement behavior, so fix that.
Also fix an independent bug introduced in the same patch: in the code
path for a never-executed node with a machine-friendly output format,
if timing was selected, it would fail to print the Actual Rows and Actual
Loops items.
Per bug #10404 from Tomonari Katsumata. Back-patch to 9.2 where the
faulty code was introduced.
I got the backup block numbers off-by-one in the commit that changed the
way incomplete-splits are handled. I blame the comments, which said
"backup block 1" and "backup block 2", even though the backup blocks
are numbered starting from 0, in the macros and functions used in replay.
Fix the comments and the code.
Per Jeff Janes' bug report about corruption caused by torn page writes.
The incorrect code is new in git master, but backpatch the comment change
down to 9.0, where the numbering in the redo-side macros was changed.
That's what I get for not fully retesting the final version of the patch.
The replace_allowed cross-check needs an additional special case for
bootstrapping.
RelationCacheInsert() ignored the possibility that hash_search(HASH_ENTER)
might find a hashtable entry already present for the same OID. However,
that can in fact occur during recursive relcache load scenarios. When it
did happen, we overwrote the pointer to the pre-existing Relation, causing
a session-lifespan leakage of that entire structure. As far as is known,
the pre-existing Relation would always have reference count zero by the
time we arrive back at the outer insertion, so add code that deletes the
pre-existing Relation if so. If by some chance its refcount is positive,
elog a WARNING and allow the pre-existing Relation to be leaked as before.
Also, AttrDefaultFetch() was sloppy about leaking the cstring form of the
pg_attrdef.adbin value it's copying into the relcache structure. This is
only a query-lifespan leakage, and normally not very significant, but it
adds up during CLOBBER_CACHE testing.
These bugs are of very ancient vintage, but I'll refrain from back-patching
since there's no evidence that these leaks amount to anything in ordinary
usage.
The fallback implementation involves acquiring and releasing a spinlock
variable that is otherwise unreferenced --- not even to the extent of
initializing it. This accidentally fails to fail on platforms where
spinlocks should be initialized to zeroes, but elsewhere it results in
a "stuck spinlock" failure during startup.
I griped about this last July, and put in a hack that worked for gcc
on HPPA, but didn't get around to fixing the general case. Per the
discussion back then, the best thing to do seems to be to initialize
dummy_spinlock in main.c.
The xl_heap_header_len structures in an XLOG_HEAP_UPDATE record aren't
necessarily aligned adequately. The regular replay function for these
records is aware of that, but decode.c didn't get the memo. I'm not
sure why the buildfarm failed to catch this; the test_decoding test
certainly blows up real good on my old HPPA box.
Also, I'm pretty sure that the address arithmetic was wrong for the
case of XLOG_HEAP_CONTAINS_OLD and not XLOG_HEAP_CONTAINS_NEW_TUPLE,
though this apparently can't happen when logical decoding is active.
transam/README explained how B-tree incomplete splits were tracked and
fixed after recovery, as an example of handling complex actions that need
multiple WAL records, but that's not how it works anymore. Explain the new
paradigm.
Several years ago we changed chr(int) so that if the database encoding is
UTF8, it would interpret its argument as a Unicode code point and expand it
into the appropriate multibyte sequence. However, we weren't sufficiently
careful about checking validity of the input. According to RFC3629, UTF8
disallows code points above U+10FFFF (note that the predecessor standard
RFC2279 was more liberal). Also, both versions of the UTF8 spec agree
that Unicode surrogate-pair codes should never appear in UTF8. Because
our encoding validity checks follow RFC3629, our failure to enforce these
restrictions in chr() means it could be used to produce text strings that
will be rejected when the database is dumped and reloaded. To ensure
consistency with the input functions, let's actually apply
pg_utf8_islegal() to the proposed output of chr().
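A sketch of the new behavior in a UTF8 database:
    SELECT chr(1114111);  -- U+10FFFF, the largest code point RFC3629 allows
    SELECT chr(1114112);  -- now rejected: beyond the Unicode range
    SELECT chr(55296);    -- now rejected: U+D800 is a surrogate code point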
Per discussion, this seems like too much of a behavioral change to
back-patch, but it's not too late to squeeze it into 9.4.
The decoding of prepared transaction commits accidentally used the XID of
the transaction performing the COMMIT PREPARED, not the XID of the prepared
transaction. Before bb38fb0d43 that led to those transactions not being
decoded, afterwards to an assertion failure.
Let's complain about e.g. an invalid path or permission problem sooner rather
than later. Before this patch, we would only try to open the output file
after receiving the first decoded message from the server.
Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed
that prepared transaction commits reuse that struct. Fix that.
Because those fields were left uninitialized, replaying a commit prepared WAL
record in a hot standby node would fail to remove the relcache init file.
That can lead to "could not open file" errors on the standby. The relcache init
file only needs to be removed when a system table/index is rewritten in the
transaction using two phase commit, so that should be rare in practice. In
HEAD, the incorrect dbId/tsId values are also used for filtering in logical
replication code, causing the transaction to always be filtered out.
Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was
introduced.
In yesterday's commit 2dc4f011fd, I tried
to force buffering of stdout/stderr in initdb to be what it is by
default when the program is run interactively on Unix (since that's how
most manual testing is done). This tripped over the fact that Windows
doesn't support _IOLBF mode. We dealt with that a long time ago in
syslogger.c by falling back to unbuffered mode on Windows. Export that
solution in port.h and use it in initdb.
Back-patch to 8.4, like the previous commit.
Don't close stdout on SIGHUP. Also, when a SIGHUP is received, close the
file immediately, rather than only after receiving some more data from
the server. Rename a variable, to avoid mentally dealing with double
negatives (not unsynced means synced).
The proc array can contain duplicate XIDs, when a transaction is just being
prepared for two-phase commit. To cope, remove any duplicates in
txid_current_snapshot(). Also ignore duplicates in the input functions, so
that if e.g. you have an old pg_dump file that already contains duplicates,
it will be accepted.
Report and fix by Jan Wieck. Backpatch to all supported versions.
To lock a prepared transaction's shared memory entry, we used to mark it
with the XID of the backend. When the XID was no longer active according
to the proc array, the entry was implicitly considered as not locked
anymore. However, when preparing a transaction, the backend's proc array
entry was cleared before transferring the locks (and some other state) to
the prepared transaction's dummy PGPROC entry, so there was a window where
another backend could finish the transaction before it was in fact fully
prepared.
To fix, rewrite the locking mechanism of global transaction entries. Instead
of an XID, just have a simple locked-or-not flag in each entry (we store the
locking backend's backend id rather than a simple boolean, but that's just
for debugging purposes). The backend is responsible for explicitly unlocking
the entry, and to make sure that that happens, install a callback to unlock
it on abort or process exit.
Backpatch to all supported versions.
Since this program may print to either stdout or stderr, the relative
ordering of its messages depends on the buffering behavior of those files.
Force stdout to be line-buffered and stderr to be unbuffered, ensuring
that the behavior will match standard Unix interactive behavior, even
when stdout and stderr are rerouted to a file.
Per complaint from Tomas Vondra. The particular case he pointed out is
new in HEAD, but issues of the same sort could arise in any branch with
other error messages, so back-patch to all branches.
I'm unsure whether we might not want to do this in other client programs
as well. For the moment, just fix initdb.
rd_replidindex should be managed the same as rd_oidindex, and rd_keyattr
and rd_idattr should be managed like rd_indexattr. Omissions in this area
meant that the bitmapsets computed for rd_keyattr and rd_idattr would be
leaked during any relcache flush, resulting in a slow but permanent leak in
CacheMemoryContext. There was also a tiny probability of relcache entry
corruption if we ran out of memory at just the wrong point in
RelationGetIndexAttrBitmap. Otherwise, the fields were not zeroed where
expected, which would not bother the code any AFAICS but could greatly
confuse anyone examining the relcache entry while debugging.
Also, create an API function RelationGetReplicaIndex rather than letting
non-relcache code be intimate with the mechanisms underlying caching of
that value (we won't even mention the memory leak there).
Also, fix a relcache flush hazard identified by Andres Freund:
RelationGetIndexAttrBitmap must not assume that rd_replidindex stays valid
across index_open.
The aspects of this involving rd_keyattr date back to 9.3, so back-patch
those changes.
Historically we've printed a complaint for a bad locale setting, but then
fallen back to the environment default. Per discussion, this is not such
a great idea, because rectifying an erroneous locale choice post-initdb
(perhaps long after data has been loaded) could be enormously expensive.
Better to complain and give the user a chance to double-check things.
The behavior was particularly bad if the bad setting came from environment
variables rather than a bogus command-line switch: in that case not only
was there a fallback to C/SQL_ASCII, but the printed complaint was quite
unhelpful. It's hard to be entirely sure what variables setlocale looked
at, but we can at least give a hint where the problem might be.
Per a complaint from Tomas Vondra.
When cache invalidations arrive while ri_LoadConstraintInfo() is busy
filling a new cache entry, InvalidateConstraintCacheCallBack() compares
the not-yet-initialized oidHashValue field with the to-be-invalidated
hash value. To fix, check whether the entry is already marked as invalid.
Andres Freund
America/Metlakatla hasn't been in the IANA database all that long, so
some installations might not have it. It does seem worthwhile to test
with a fractional-minute GMT offset, but we can get that from almost
any pre-1900 date; I chose Europe/Paris, whose LMT offset from Greenwich
should be pretty darn well established.
Also, assuming that Mars/Mons_Olympus will never be in the IANA database
seems less than future-proof, so let's use a more fanciful location for
the bad-zone-name check.
Per complaint from Christoph Berg.
config.pl and buildenv.pl can be used to customize build settings when
using MSVC. They should never get committed into the common source tree.
Back-patch to 9.0; it looks like the rules were different in 8.4.
Michael Paquier
The leak is fairly small and rare, but a leak nevertheless.
Per Coverity report. Backpatch to 9.2, where pg_receivexlog was added.
pg_basebackup shares the code, but it always exits on error, so there is
no real leak.
The original coding for ALTER SYSTEM made a fundamentally bogus assumption
that postgresql.auto.conf could be sought relative to the main config file
if we hadn't yet determined the value of data_directory. This fails for
common arrangements with the config file elsewhere, as reported by
Christoph Berg.
The simplest fix is to not try to read postgresql.auto.conf until after
SelectConfigFiles has chosen (and locked down) the data_directory setting.
Because of the logic in ProcessConfigFile for handling resetting of GUCs
that've been removed from the config file, we cannot easily read the main
and auto config files separately; so this patch adopts a brute force
approach of reading the main config file twice during postmaster startup.
That's a tad ugly, but the actual time cost is likely to be negligible,
and there's no time for a more invasive redesign before beta.
With this patch, any attempt to set data_directory via ALTER SYSTEM
will be silently ignored. It would probably be better to throw an
error, but that can be dealt with later. This bug, however, would
prevent any testing of ALTER SYSTEM by a significant fraction of the
userbase, so it seems important to get it fixed before beta.
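A rough illustration of the resulting behavior (hypothetical settings):
    -- accepted, but the setting is ignored when the config files are read
    ALTER SYSTEM SET data_directory = '/some/other/location';
    -- ordinary settings continue to work as before
    ALTER SYSTEM SET work_mem = '64MB';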
There's no longer much pressure to switch the default GIN opclass for
jsonb, but there was still some unhappiness with the name "jsonb_hash_ops",
since hashing is no longer a distinguishing property of that opclass,
and anyway it seems like a relatively minor detail. At the suggestion of
Heikki Linnakangas, we'll use "jsonb_path_ops" instead; that captures the
important characteristic that each index entry depends on the entire path
from the document root to the indexed value.
Also add a user-facing explanation of the implementation properties of
these two opclasses.
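For example (a sketch; the table and column names are hypothetical):
    -- default opclass, still named jsonb_ops
    CREATE INDEX docs_doc_idx ON docs USING gin (doc);
    -- the renamed opclass, formerly jsonb_hash_ops
    CREATE INDEX docs_doc_path_idx ON docs USING gin (doc jsonb_path_ops);
    -- containment queries such as this can use either index
    SELECT * FROM docs WHERE doc @> '{"tags": ["postgres"]}';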
Per discussion, this seems like a more consistent choice of name.
Fabrízio de Royes Mello, after a suggestion by Peter Eisentraut;
some additional documentation wordsmithing by me
When returning rows from a bitmap, as done with partial match queries, we
would get stuck in an infinite loop if the bitmap contained a lossy page
reference.
This bug is new in master; it was introduced by the patch to allow skipping
items refuted by other entries in GIN scans.
Report and fix by Alexander Korotkov
reserveFromBuffer() failed to consider the possibility that it needs to
more-than-double the current buffer size. Beyond that, it seems likely
that we'd someday need to worry about integer overflow of the buffer
length variable. Rather than reinvent the logic that's already been
debugged in stringinfo.c, let's go back to using that logic. We can
still have the same targeted API, but we'll rely on stringinfo.c to
manage reallocation.
Per report from Alexander Korotkov.
These functions were relying on typcategory to identify arrays and
composites, which is not reliable and not the normal way to do it.
Using typcategory to identify boolean, numeric types, and json itself is
also pretty questionable, though the code in those cases didn't seem to be
at risk of anything worse than wrong output. Instead, use the standard
lsyscache functions to identify arrays and composites, and rely on a direct
check of the type OID for the other cases.
In HEAD, also be sure to look through domains so that a domain is treated
the same as its base type for conversions to JSON. However, this is a
small behavioral change; given the lack of field complaints, we won't
back-patch it.
In passing, refactor so that there's only one copy of the code that decides
which conversion strategy to apply, not multiple copies that could (and
have) gotten out of sync.
Post-commit review identified a number of places where addition was
used instead of multiplication or memory wasn't zeroed where it should
have been. This commit also fixes one case where a structure member
was mis-initialized, and moves another memory allocation closer to
the place where the allocated storage is used for clarity.
Andres Freund
This code really needs to be refactored so that there aren't so many copies
that can diverge. Not to mention that this whole approach is probably
wrong. But for the moment I'll just stick my finger in the dike.
Per report from Michael Paquier.
Fix JSONB_MAX_ELEMS and JSONB_MAX_PAIRS macros to use CB_MASK in the
calculation. JENTRY_POSMASK happens to have the same value at the moment,
but that's just coincidental.
Refactor jsonb iterator functions, for readability.
Get rid of the JENTRY_ISFIRST flag. Whenever we handle JEntrys, we have
access to the whole array and have enough context information to know
which entry is the first. This frees up one bit in the JEntry header for
future use. While we're at it, shuffle the JEntry bits so that boolean
true and false go together, for aesthetic reasons.
Bump catalog version as this changes the on-disk format slightly.
Change the key representation so that values that would exceed 127 bytes
are hashed into short strings, and so that the original JSON datatype of
each value is recorded in the index. The hashing rule eliminates the major
objection to having this opclass be the default for jsonb, namely that it
could fail for plausible input data (due to GIN's restrictions on maximum
key length). Preserving datatype information doesn't really buy us much
right now, but it requires no extra space compared to the previous way,
and it might be useful later.
Also, change the consistency-checking functions to request recheck for
exists (jsonb ? text) and related operators. The original analysis that
this is an exactly checkable query was incorrect, since the index does
not preserve information about whether a key appears at top level in
the indexed JSON object. Add a test case demonstrating the problem.
Make some other, mostly cosmetic improvements to the code in jsonb_gin.c
as well.
catversion bump due to on-disk data format change in jsonb_ops indexes.
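A sketch of the operators affected by the recheck change (hypothetical table
and data):
    CREATE TABLE docs (doc jsonb);
    CREATE INDEX ON docs USING gin (doc);  -- default jsonb_ops
    -- "exists" and friends now request recheck, since the index alone
    -- cannot tell whether the key appears at the top level
    SELECT * FROM docs WHERE doc ? 'a';
    SELECT * FROM docs WHERE doc ?| ARRAY['a', 'b'];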
Move the functions around to group related functions together. Remove
binequal argument from lengthCompareJsonbStringValue, moving that
responsibility to lengthCompareJsonbPair. Fix typo in comment.
Ensure that ecpg preprocessor output files are rebuilt when re-testing
after a change in the ecpg preprocessor itself, or a change in any of
several include files that get copied verbatim into the output files.
The lack of these dependencies was what created problems for Kevin Grittner
after the recent pgindent run. There's no way for --enable-depend to
discover these dependencies automatically, so we've gotta put them into
the Makefiles by hand.
While at it, reduce the amount of duplication in the ecpg invocations.
Per discussion, the old value of 128MB is ridiculously small on modern
machines; in fact, it's not even any larger than the default value of
shared_buffers, which it certainly should be. Increase to 4GB, which
is unlikely to be any worse than the old default for anyone, and should
be noticeably better for most. Eventually we might have an autotuning
scheme for this setting, but the recent attempt crashed and burned,
so for now just do this.
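To check or override the new default, something along these lines works (the
override value is just an example):
    SHOW effective_cache_size;                       -- now 4GB by default
    ALTER SYSTEM SET effective_cache_size = '8GB';   -- requires superuser
    SELECT pg_reload_conf();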
This reverts commit ee1e5662d8, as well as
a remarkably large number of followup commits, which were mostly concerned
with the fact that the implementation didn't work terribly well. It still
doesn't: we probably need some rather basic work in the GUC infrastructure
if we want to fully support GUCs whose default varies depending on the
value of another GUC. Meanwhile, it also emerged that there wasn't really
consensus in favor of the definition the patch tried to implement (ie,
effective_cache_size should default to 4 times shared_buffers). So whack
it all back to where it was. In a followup commit, I'll do what was
recently agreed to, which is to simply change the default to a higher
value.
Commit 4318daecc9 broke it. The change in
sub-second precision at extreme dates is normal. The inconsistent
truncation vs. rounding is essentially a bug, albeit a longstanding one.
Back-patch to 8.4, like the causative commit.
Previous commit was confused about the case we're handling: actually,
what the patch is dealing with is platforms that have optreset, *and*
have <getopt.h>, but the latter fails to declare the former. Because
we use a linking probe to set HAVE_INT_OPTRESET, we need to be sure we
have a declaration even if <getopt.h> doesn't think it exists.
Reportedly, some versions of mingw are like that, and it seems plausible
in general that older platforms might be that way. However, we'd
determined experimentally that just doing "extern int" conflicts with
the way Cygwin declares these variables, so explicitly exclude Cygwin.
Michael Paquier, tweaked by me to hopefully not break Cygwin
To-be-deleted list pages contain no useful information, as they are being
deleted, but we must still protect the writes from being torn by a crash
after a partial write. To do that, re-initialize the pages on WAL replay.
Jeff Janes caught this with a test program to test partial writes.
Backpatch to all supported versions.
If the server sends a long stream of data, and the server + network are
consistently fast enough to force the recv() loop in pqReadData() to
iterate until libpq's input buffer is full, then upon processing the last
incomplete message in each bufferload we'd usually double the buffer size,
due to supposing that we didn't have enough room in the buffer to finish
collecting that message. After filling the newly-enlarged buffer, the
cycle repeats, eventually resulting in an out-of-memory situation (which
would be reported misleadingly as "lost synchronization with server").
Of course, we should not enlarge the buffer unless we still need room
after discarding already-processed messages.
This bug dates back quite a long time: pqParseInput3 has had the behavior
since perhaps 2003, getCopyDataMessage at least since commit 70066eb1a1
in 2008. Probably the reason it's not been isolated before is that in
common environments the recv() loop would always be faster than the server
(if on the same machine) or faster than the network (if not); or at least
it wouldn't be slower consistently enough to let the buffer ramp up to a
problematic size. The reported cases involve Windows, which perhaps has
different timing behavior than other platforms.
Per bug #7914 from Shin-ichi Morita, though this is different from his
proposed solution. Back-patch to all supported branches.
Just as we would start bgworkers immediately after an initial startup
of the server, we should restart them immediately when reinitializing.
Petr Jelinek and Robert Haas
The main target of this cleanup is the convertJsonb() function, but I also
touched a lot of other things that I spotted in the process.
The new convertToJsonb() function uses an output buffer that's resized on
demand, so the code to estimate the size of a JsonbValue is removed.
The on-disk format was not changed, even though I refactored the structs
used to handle it. The term "superheader" is replaced with "container".
The jsonb_exists_any and jsonb_exists_all functions no longer sort the input
array. That was a premature optimization, the idea being that if there are
duplicates in the input array, you only need to check them once. Also,
sorting the array saves some effort in the binary search used to find a key
within an object. But there were drawbacks too: the sorting and
deduplicating obviously isn't free, and in the typical case there are no
duplicates to remove, and the gain in the binary search was minimal. Remove
all that, which makes the code simpler too.
This includes a bug-fix; the total length of the elements in a jsonb array
or object mustn't exceed 2^28. That is now checked.
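The user-visible behavior of the ?| and ?& operators is unchanged; for
example:
    SELECT '{"a": 1, "b": 2}'::jsonb ?| ARRAY['a', 'z'];  -- true
    SELECT '{"a": 1, "b": 2}'::jsonb ?& ARRAY['a', 'z'];  -- false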
Since the postmaster won't perform a crash-and-restart sequence
for background workers which don't request shared memory access,
we'd better make sure that they can't corrupt shared memory.
Patch by me, review by Tom Lane.
ActiveSnapshot needs to be set when we call ExecutorRewind because some
plan node types may execute user-defined functions during their ReScan
calls (nodeLimit.c does so, at least). The wisdom of that is somewhat
debatable, perhaps, but for now the simplest fix is to make sure the
required context is valid. Failure to do this typically led to a
null-pointer-dereference core dump, though it's possible that in more
complex cases a function could be executed with the wrong snapshot
leading to very subtle misbehavior.
Per report from Leif Jensen. It's been broken for a long time, so
back-patch to all active branches.
The motivation for a crash and restart cycle when a backend dies is
that it might have corrupted shared memory on the way down; and we
can't recover reliably except by reinitializing everything. But that
doesn't apply to processes that don't touch shared memory. Currently,
there's nothing to prevent a background worker that doesn't request
shared memory access from touching shared memory anyway, but that's a
separate bug.
Previous to this commit, the coding in postmaster.c was inconsistent:
an exit status other than 0 or 1 didn't provoke a crash-and-restart,
but failure to release the postmaster child slot did. This change
makes those cases consistent.
Commit 4318daecc9 introduced a test that
couldn't be made consistent between integer and floating-point
timestamps.
It was designed to test the longest possible interval output length,
so removing four zeros from the number of hours, as this patch does,
is not ideal. But the test still has some utility for its original
purpose, and there aren't a lot of other good options.
Noah Misch suggested a different approach where we test that the
output either matches what we expect from integer timestamps or what
we expect from floating-point timestamps. That seemed to obscure an
otherwise simple test, however.
Reviewed by Tom Lane and Noah Misch.
The coding in JsonbHashScalarValue might have accidentally failed to fail
given current representational choices, but the key word there would be
"accidental". Insert the appropriate datatype conversion macro. And
use the right conversion macro for hash_numeric's result, too.
In passing make the code a bit cleaner and less repetitive by factoring
out the xor step from the switch.
Index-only scans avoid taking a lock on the VM buffer, which would
cause a lot of contention. To be correct, that requires some intricate
assumptions that weren't completely documented in the previous
comment.
Reviewed by Robert Haas.
The previous coding would potentially cause attaching to segment A to
fail if segment B was at the same time in the process of going away.
Andres Freund, with a comment tweak by me
Commit d298b50a3b by Heikki Linnakangas
requested that the version check message be updated at the next release,
suggesting that the appropriate text would be “9.3 or later”. The logic used
for the check indicates that the correct text for 9.4 is “9.3 or 9.4”, since
that logic would cause the check to fail for later releases.
When an array of char * was used as the target for a FETCH statement returning
more than one row, ecpg tried to store all the results in the first element.
Instead it should step through the array of char pointers with the right
offset, use the address rather than the value of the C variable while reading
the array, and treat such a variable as char **, instead of char *, for
pointer arithmetic.
Patch by Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Commit fad153ec45 modified sinval.c to reduce
the number of calls into sinvaladt.c (which require taking a shared lock)
by keeping a local buffer of collected-but-not-yet-processed messages.
However, if processing of the last message in a batch resulted in a
recursive call to ReceiveSharedInvalidMessages, we could overwrite that
message with a new one while the outer invalidation function was still
working on it. This would be likely to lead to invalidation of the wrong
cache entry, allowing subsequent processing to use stale cache data.
The fix is just to make a local copy of each message while we're processing
it.
Spotted by Andres Freund. Back-patch to 8.4 where the bug was introduced.
Commit 261c7d4b65 removed the "m" field
from struct LINE, but neglected to make pg_type.h's idea of the type's
size match. This resulted in reading past the end of palloc'd LINE
values when inserting them into tuples etc. In principle that could
cause a SIGSEGV, though the odds of detectable problems seem low.
Bump catversion since this makes an incompatible on-disk format change.
Note that if the line type had been in use in the field, this would
break pg_upgrade'ability of databases containing line values; but
it seems unlikely that there are any (they'd have had to be compiled
with -DENABLE_LINE_TYPE).
Spotted by Andres Freund.
This was accidentally broken in commits cfa1b4a711/5e8e794e3b.
It saves a line or so to call ftello unconditionally in _CloseArchive,
but we have to expect that it might fail if we're not in hasSeek mode.
Per report from Bernd Helmle.
In passing, improve _getFilePos to print an appropriate message if
ftello fails unexpectedly, rather than just a vague complaint about
"ftell mismatch".
If they were not, 'oldtup.t_data' would be dereferenced while set to NULL in
the case of a full-page image for block 0.
Do so primarily to silence Coverity, but also to make sure this prerequisite
isn't changed without adapting the replay routine, as that would appear to
work in many cases.
Andres Freund
It's easy to forget to use SYSTEMQUOTEs when constructing command strings
for system() or popen(). Even if we fix all the places missing it now, it is
bound to be forgotten again in the future. Introduce wrapper functions that
do the extra quoting for you, and get rid of SYSTEMQUOTEs in all the
callers.
We previously used SYSTEMQUOTEs in all the hard-coded command strings, and
this doesn't change the behavior of those. But user-supplied commands, like
archive_command, restore_command, COPY TO/FROM PROGRAM calls, as well as
pgbench's \shell, will now gain an extra pair of quotes. That is desirable,
but if you have existing scripts or config files that include an extra
pair of quotes, those might need to be adjusted.
Reviewed by Amit Kapila and Tom Lane
ruleutils.c tries to cope with additions/deletions/renamings of columns in
tables referenced by views, by means of adding machine-generated aliases to
the printed form of a view when needed to preserve the original semantics.
A recent blog post by Marko Tiikkaja pointed out a case I'd missed though:
if one input of a join with USING is itself a join, there is nothing to
stop the user from adding a column of the same name as the USING column to
whichever side of the sub-join didn't provide the USING column. And then
there'll be an error when the view is re-parsed, since now the sub-join
exposes two columns matching the USING specification. We were catching a
lot of related cases, but not this one, so add some logic to cope with it.
Back-patch to 9.3, which is the first release that makes any serious
attempt to cope with such cases (cf commit 2ffa740be and follow-ons).
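A rough sketch of the scenario (all names hypothetical):
    CREATE TABLE a (id int, x int);
    CREATE TABLE b (x int);
    CREATE TABLE c (id int);
    CREATE VIEW v AS
      SELECT id FROM (a JOIN b USING (x)) JOIN c USING (id);
    -- the sub-join now exposes two columns matching the outer USING (id),
    -- so the deparsed view text must alias one of them to stay unambiguous
    ALTER TABLE b ADD COLUMN id int;
    SELECT pg_get_viewdef('v'::regclass);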
If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row. The same applies for ranges over composite types, nested
composites, etc. However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites. For example, given a command such
as
UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype)] ...
where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column. If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the line of "missing chunk number 0 for toast value ..."
when the value is referenced. A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.
Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example form_index_tuple().
I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out). The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.
This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple. This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first. With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.
The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before. On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings. Experimentation suggests that common SQL
coding patterns are unaffected either way, though. Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.
In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.
Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.
This bug has existed since TOAST was invented, so back-patch to all
supported branches.
Be more clear about failure cases in relfilenode->relation lookup,
and fix some other places that were inconsistent or not per our
message style guidelines.
Andres Freund and Tom Lane
This was intended to work always, but the previous code only allowed
it if at least one message was successfully read by the receiver
before the sender detached the queue.
Report by Petr Jelinek. Patch by me.
Commit a730183926 created rather a mess by
putting dependencies on backend-only include files into include/common.
We really shouldn't do that. To clean it up:
* Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in
catalog/catalog.h. We won't consider this symbol part of the FE/BE API.
* Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider
relpath.h as the source of truth for fork numbers, since relpath.c was
already partially serving that function, and anyway relfilenode.h was
kind of a random place for that enum.
* So, relfilenode.h now includes relpath.h rather than vice-versa. This
direction of dependency is fine. (That allows most, but not quite all,
of the existing explicit #includes of relpath.h to go away again.)
* Push forkname_to_number from catalog.c to relpath.c, just to centralize
fork number stuff a bit better.
* Push GetDatabasePath from catalog.c to relpath.c; it was rather odd
that the previous commit didn't keep this together with relpath().
* To avoid needing relfilenode.h in common/, redefine the underlying
function (now called GetRelationPath) as taking separate OID arguments,
and make the APIs using RelFileNode or RelFileNodeBackend into macro
wrappers. (The macros have a potential multiple-eval risk, but none of
the existing call sites have an issue with that; one of them had such a
risk already anyway.)
* Fix failure to follow the directions when "init" fork type was added;
specifically, the errhint in forkname_to_number wasn't updated, and neither
was the SGML documentation for pg_relation_size().
* Fix tablespace-path-too-long check in CreateTableSpace() to account for
fork-name component of maximum-length pathnames. This requires putting
FORKNAMECHARS into a header file, but it was rather useless (and
actually unreferenced) where it was.
The last couple of items are potentially back-patchable bug fixes,
if anyone is sufficiently excited about them; but personally I'm not.
Per a gripe from Christoph Berg about how include/common wasn't
self-contained.
Since ruleutils.c recurses, it could be driven to stack overflow by
deeply nested constructs. Very large queries might also take long
enough to deparse that a check for interrupts seems like a good idea.
Stick appropriate tests into a couple of key places.
Noted by Greg Stark. Back-patch to all supported branches.
A query such as "SELECT x UNION SELECT y UNION SELECT z UNION ..."
produces a left-deep nested parse tree, which we formerly showed in its
full nested glory and with all the possible parentheses. This does little
for readability, though, and long UNION lists resulting in excessive
indentation are common. Instead, let's omit parentheses and indent all
the subqueries at the same level in such cases.
This patch skips indentation/parenthesization whenever the lefthand input
of a SetOperationStmt is another SetOperationStmt of the same kind and
ALL/DISTINCT property. We could teach the code the exact syntactic
precedence of set operations and thereby avoid parenthesization in some
more cases, but it's not clear that that'd be a readability win: it seems
better to parenthesize if the set operation changes. (As an example,
if there's one UNION in a long list of UNION ALL, it now stands out like
a sore thumb, which seems like a good thing.)
Back-patch to 9.3. This completes our response to a complaint from Greg
Stark that since commit 62e666400d there's a performance problem in pg_dump
for views containing long UNION sequences (or other types of deeply nested
constructs). The previous commit 0601cb54da
handles the general problem, but this one makes the specific case of UNION
lists look a lot nicer.
Continuing to indent no matter how deeply nested we get doesn't really
do anything for readability; what's worse, it results in O(N^2) total
whitespace, which can become a performance and memory-consumption issue.
To address this, once we get past 40 characters of indentation, reduce
the indentation step distance 4x, and also limit the maximum indentation
by reducing it modulo 40. This latter choice is a bit weird at first
glance, but it seems to preserve readability better than a simple cap
would do.
Back-patch to 9.3, because since commit 62e666400d the performance issue
is a hazard for pg_dump.
Greg Stark and Tom Lane
The code attempted to outdent JOIN clauses further left than the parent
FROM keyword, which was odd in any case, and led to inconsistent formatting
since in simple cases the clauses couldn't be moved any further left than
that. And it left a permanent decrement of the indentation level, causing
subsequent lines to be much further left than they should be (again, this
couldn't be seen in simple cases for lack of indentation to give up).
After a little experimentation I chose to make it indent JOIN keywords
two spaces from the parent FROM, which is one space more than the join's
lefthand input in cases where that appears on a different line from FROM.
Back-patch to 9.3. This is a purely cosmetic change, and the bug is quite
old, so that may seem arbitrary; but we are going to be making some other
changes to the indentation behavior in both HEAD and 9.3, so it seems
reasonable to include this in 9.3 too. I committed this one first because
its effects are more visible in the regression test results as they
currently stand than they will be later.
Some popen() calls were missing SYSTEMQUOTEs, which caused initdb and
pg_upgrade to fail on Windows, if the installation path contained both
spaces and @ signs.
Patch by Nikhil Deshpande. Backpatch to all supported versions.
The error test case in the plpython_do test resulted in a slightly
different error message with Python 3.4. So pick a different way to
test it that avoids that and is perhaps also a bit clearer.
In general we can't discard constant-NULL inputs, since they could change
the result of the AND/OR to be NULL. But at top level of WHERE, we do not
need to distinguish a NULL result from a FALSE result, so it's okay to
treat NULL as FALSE and then simplify AND/OR accordingly.
This is a very ancient oversight, but in 9.2 and later it can lead to
failure to optimize queries that previous releases did optimize, as a
result of more aggressive parameter substitution rules making it possible
to reduce more subexpressions to NULL constants. This is the root cause of
bug #10171 from Arnold Scheffler. We could alternatively have fixed that
by teaching orclauses.c to ignore constant-NULL OR arms, but it seems
better to get rid of them globally.
I resisted the temptation to back-patch this change into all active
branches, but it seems appropriate to back-patch as far as 9.2 so that
there will not be performance regressions of the kind shown in this bug.
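For example, a query of this shape (hypothetical table) can now be simplified
again, since the bare NULL arm is treated as FALSE at the top level of WHERE
and the OR collapses to the other arm:
    EXPLAIN SELECT * FROM t WHERE x = 42 OR NULL;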
feasible to display tables that have both many columns and some large
data in some columns (such as pg_stats).
Emre Hasegeli with review and rewriting from Sergey Muraviov and
reviewed by Greg Stark
In writeListPage, never take a full-page image of the page, because we
have all the information required to re-initialize in the WAL record
anyway. Before this fix, a full-page image was always generated, unless
full_page_writes=off, because when the page is initialized its LSN is
always 0. In stable branches, keep the code to restore the backup blocks
if they exist, in case the WAL was generated by an older minor
version; but in master, Assert that there are no full-page images.
In the redo routine, add missing "off++". Otherwise the tuples are added
to the page in reverse order. That happens to be harmless because we
always scan and remove all the tuples together, but it was clearly wrong.
Also, it was masked by the first bug unless full_page_writes=off, because
the page was always restored from a full-page image.
Backpatch to all supported versions.
pg_controldata includes postgres.h not postgres_fe.h, so utils/palloc.h
must be able to compile in a "#define FRONTEND" context. It appears that
Solaris Studio is smart enough to persuade us to define PG_USE_INLINE,
but not smart enough to not make a copy of unreferenced static functions;
which leads to an unsatisfied reference to CurrentMemoryContext. So we
need an #ifndef FRONTEND around that declaration. Per buildfarm.
As noted some time ago, the original coding had a typo ("|" for "^")
that made the result less unique than intended. Even the intended
behavior is obsolete since it was based on wanting to produce a
usable value even if we didn't have int64 arithmetic --- a limitation
we stopped supporting years ago. Instead, let's redefine the system
identifier as tv_sec in the upper 32 bits (same as before), tv_usec
in the next 20 bits, and the low 12 bits of getpid() in the remaining
bits. This is still hardly guaranteed-universally-unique, but it's
noticeably better than before. Per my proposal at
<29019.1374535940@sss.pgh.pa.us>
This breaks the principle that common/ ought not depend on anything in the
server, not only code-wise but in the headers. The only arguable advantage
is avoidance of duplication of half a dozen extern declarations, and even
that is rather dubious, considering that the previous coding was wrong
about which declarations to duplicate: it exposed pnstrdup() to frontend
code even though no such function is provided in fe_memutils.c.
On the same principle, don't #include utils/memutils.h in the frontend
build of psprintf.c. This requires duplicating the definition of
MaxAllocSize, but that seems fine to me: there's no a-priori reason why
frontend code should use the same size limit as the backend anyway.
In passing, clean up some rather odd layout and ordering choices that
were imposed on palloc.h to reduce the number of #ifdefs required by
the previous approach.
Per gripe from Christoph Berg. There's still more work to do to make
include/common/ clean, but this part seems reasonably noncontroversial.
We should use exprTypmod() to extract the typmod of the expression,
instead of just blindly storing -1. This seems to have been an aboriginal
oversight in commit fc8d970cbc which
introduced general-expression indexes. The consequences are only cosmetic
at present, since the index machinery doesn't really look at typmod for
index columns; but still it seems best to describe the column type as
precisely as we can. Per off-list complaint from Thomas Fanghaenel.
Original coding failed to enlarge the array as required if
the requested tranche_id was equal to LWLockTranchesAllocated.
In passing, fix poor style of not casting the result of (re)palloc.
Commit 7d0f493f19 added infrastructure
to perform tests in assorted src/bin/ subdirectories, but forgot to
teach "make clean" to clean up the detritus the tests leave behind.
If a tuple is locked, and this lock is later upgraded either to an
update or to a stronger lock, and in the meantime some other process
tries to lock, update or delete the same tuple, it (the tuple) could end
up being updated twice, or having conflicting locks held.
The reason for this is that the second updater checks for a change in
Xmax value, or in the HEAP_XMAX_IS_MULTI infomask bit, after noticing
the first lock; and if there's a change, it restarts and re-evaluates
its ability to update the tuple. But it neglected to check for changes
in lock strength or in lock-vs-update status when those two properties
stayed the same. This would lead it to take the wrong decision and
continue with its own update, when in reality it shouldn't do so but
instead restart from the top.
This could lead to either an assertion failure much later (when a
multixact containing multiple updates is detected), or duplicate copies
of tuples.
To fix, make sure to compare the other relevant infomask bits alongside
the Xmax value and HEAP_XMAX_IS_MULTI bit, and restart from the top if
necessary.
Also, in the belt-and-suspenders spirit, add a check to
MultiXactCreateFromMembers that a multixact being created does not have
two or more members that are claimed to be updates. This should protect
against other bugs that might cause similar bogus situations.
Backpatch to 9.3, where the possibility of multixacts containing updates
was introduced. (In prior versions it was possible to have the tuple
lock upgraded from shared to exclusive, and an update would not restart
from the top; yet we're protected against a bug there because there's
always a sleep to wait for the locking transaction to complete before
continuing to do anything. Really, the fact that tuple locks always
conflicted with concurrent updates is what protected against bugs here.)
Per report from Andrew Dunstan and Josh Berkus in thread at
http://www.postgresql.org/message-id/534C8B33.9050807@pgexperts.com
Bug analysis by Andres Freund.
Once we've completed a PREPARE, our session is not running a transaction,
so its entry in pg_stat_activity should show xact_start as null, rather
than leaving the value as the start time of the now-prepared transaction.
I think possibly this oversight was triggered by faulty extrapolation
from the adjacent comment that says PrepareTransaction should not call
AtEOXact_PgStat, so tweak the wording of that comment.
Noted by Andres Freund while considering bug #10123 from Maxim Boguk,
although this error doesn't seem to explain that report.
Back-patch to all active branches.
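A sketch of the now-expected behavior (requires max_prepared_transactions
greater than zero; the gid is arbitrary):
    BEGIN;
    -- ... do some work ...
    PREPARE TRANSACTION 'demo_gxact';
    -- the session is no longer in a transaction, so xact_start is now null
    SELECT xact_start FROM pg_stat_activity WHERE pid = pg_backend_pid();
    COMMIT PREPARED 'demo_gxact';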
pg_sequence_parameters() and pg_identify_object() have had incorrect
proallargtypes entries since 9.1 and 9.3 respectively. This was mostly
masked by the correct information in proargtypes, but a few operations
such as pg_get_function_arguments() (and thus psql's \df display) would
show the wrong data types for these functions' input parameters.
In HEAD, fix the wrong info, bump catversion, and add an opr_sanity
regression test to catch future mistakes of this sort.
In the back branches, just fix the wrong info so that installations
initdb'd with future minor releases will have the right data. We
can't force an initdb, and it doesn't seem like a good idea to add
a regression test that will fail on existing installations.
Andres Freund
Before 9.4, such an aggregate couldn't be declared, because its final
function would have to have polymorphic result type but no polymorphic
argument, which CREATE FUNCTION would quite properly reject. The
ordered-set-aggregate patch found a workaround: allow the final function
to be declared as accepting additional dummy arguments that have types
matching the aggregate's regular input arguments. However, we failed
to notice that this problem applies just as much to regular aggregates,
despite the fact that we had a built-in regular aggregate array_agg()
that was known to be undeclarable in SQL because its final function
had an illegal signature. So what we should have done, and what this
patch does, is to decouple the extra-dummy-arguments behavior from
ordered-set aggregates and make it generally available for all aggregate
declarations. We have to put this into 9.4 rather than waiting till
later because it slightly alters the rules for declaring ordered-set
aggregates.
The patch turned out a bit bigger than I'd hoped because it proved
necessary to record the extra-arguments option in a new pg_aggregate
column. I'd thought we could just look at the final function's pronargs
at runtime, but that didn't work well for variadic final functions.
It's probably just as well though, because it simplifies life for pg_dump
to record the option explicitly.
While at it, fix array_agg() to have a valid final-function signature,
and add an opr_sanity test to notice future deviations from polymorphic
consistency. I also marked the percentile_cont() aggregates as not
needing extra arguments, since they don't.
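A sketch of a declaration that this makes possible, reusing the built-in
array_agg support functions (the aggregate name is hypothetical):
    CREATE AGGREGATE my_array_agg(anynonarray) (
        SFUNC = array_agg_transfn,
        STYPE = internal,
        FINALFUNC = array_agg_finalfn,
        FINALFUNC_EXTRA  -- final function gets dummy args matching the input
    );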
When marking a branch as half-dead, a pointer to the top of the branch is
stored in the leaf block's hi-key. During normal operation, the high key
was left in place, and the block number was just stored in the ctid field
of the high key tuple, but in WAL replay, the high key was recreated as a
truncated tuple with zero columns. For the sake of easier debugging, also
truncate the tuple in normal operation, so that the page is identical
after WAL replay. Also, rename the 'downlink' field in the WAL record to
'topparent', as that seems like a more descriptive name. And make sure
it's set to invalid when unlinking the leaf page.
Some ancient comments claimed that fn_nargs could be -1 to indicate a
variable number of input arguments; but this was never implemented, and
is at variance with what we ultimately did with "variadic" functions.
Update the comments.
Forgot to update LSN of left sibling's page, when creating a new root.
I fixed this for regular insertions and page splits earlier, but missed
new root creation.
The README incorrectly claimed that GIN posting tree pages contain an array
of uncompressed items in addition to compressed posting lists. Earlier
versions of the GIN posting list compression patch worked that way, but not
the one that was committed.
Don't use simple_heap_insert to insert the tuple into a sequence relation.
simple_heap_insert creates a heap insertion WAL record, and replaying that
will create a regular heap page without the special area containing the
sequence magic constant, which is wrong for a sequence. That was not a bug
because we always created a sequence WAL record after that, and replaying
that overwrote the bogus heap page, and the transient state could never be
seen by another backend because it was only done when creating a new
sequence relation. But it's simpler and cleaner to avoid that in the first
place.
Previously, these functions treated "" option values as defaults in some
ways, but not in others, like when comparing to .pgpass. Also, add
documentation to clarify that now "" and NULL use defaults, like
PQsetdbLogin() has always done.
BACKWARD INCOMPATIBILITY
Patch by Adrian Vondendriesch, docs by me
Report by Jeff Janes
Because of gcc -Wmissing-prototypes, all functions in dynamically
loadable modules must have a separate prototype declaration. This is
meant to detect global functions that are not declared in header files,
but in cases where the function is called via dfmgr, this is redundant.
Besides filling up space with boilerplate, this is a frequent source of
compiler warnings in extension modules.
We can fix that by creating the function prototype as part of the
PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That
makes the code of modules cleaner, because there is one less place where
the entry points have to be listed, and creates an additional check that
functions have the right prototype.
Remove now redundant prototypes from contrib and other modules.
Introduced in 585bca39: msgid is not used in the Windows code path.
Also adjust comments a tad (mostly to keep pgindent from messing it up).
David Rowley
If we set the all-visible flag after writing WAL record, and XLogInsert
takes a full-page image of the page, the image would not include the flag.
We will then proceed to set the VM bit, which would then be set without the
corresponding all-visible flag on the heap page.
Found by comparing page images on master and standby, after writing/replaying
each WAL record. (There is still a discrepancy: the all-visible flag won't
be set after replaying the HEAP_CLEAN record, even though it is set in the
master. However, it will be set when replaying the HEAP2_VISIBLE record and
setting the VM bit, so the all-visible flag and VM bit are always consistent
on the standby, even though they are momentarily out of sync with the master.)
Backpatch to 9.3 where this code was introduced.
Now that EXPLAIN also outputs a "planning time" measurement, the use of
"total" here seems rather confusing: it sounds like it might include the
planning time which of course it doesn't. Majority opinion was that
"execution time" is a better label, so we'll call it that.
This should be noted as a backwards incompatibility for tools that examine
EXPLAIN ANALYZE output.
In passing, I failed to resist the temptation to do a little editing on the
materialized-view example affected by this change.
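For example (timings illustrative only):
    EXPLAIN ANALYZE SELECT count(*) FROM pg_class;
    -- ...
    --  Planning time: 0.123 ms
    --  Execution time: 1.456 ms      (formerly labeled "Total runtime")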