postgresql

Commit Graph

Author	SHA1	Message	Date
Heikki Linnakangas	98f58a30c1	Fix Hot-Standby initialization of clog and subtrans. These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read only queries, by marking committed transactions as uncommitted. The likelihood of such corruptions is small unless the primary has a high transaction rate. `5a031a5556` fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superfluous since clog extensions are actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unnecessary - but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earlier commit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record old transactions' commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Analysis and patch by Andres Freund. Reported by Christophe Pettus. Backpatch down to 9.0, like the previous commit that caused this.	2013-11-22 14:45:41 +02:00
Heikki Linnakangas	1a3d104475	Avoid acquiring spinlock when checking if recovery has finished, for speed. RecoveryIsInProgress() can be called very frequently. During normal operation, it just checks a backend-local variable and returns quickly, but during hot standby, it checks a spinlock-protected shared variable. Those spinlock acquisitions can become a point of contention on a busy hot standby system. Replace the spinlock acquisition with a memory barrier. Per discussion with Andres Freund, Ants Aasma and Merlin Moncure.	2013-11-22 13:07:23 +02:00
Tom Lane	784e762e88	Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. This patch adds the ability to write TABLE( function1(), function2(), ...) as a single FROM-clause entry. The result is the concatenation of the first row from each function, followed by the second row from each function, etc; with NULLs inserted if any function produces fewer rows than others. This is believed to be a much more useful behavior than what Postgres currently does with multiple SRFs in a SELECT list. This syntax also provides a reasonable way to combine use of column definition lists with WITH ORDINALITY: put the column definition list inside TABLE(), where it's clear that it doesn't control the ordinality column as well. Also implement SQL-compliant multiple-argument UNNEST(), by turning UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)). The SQL standard specifies TABLE() with only a single function, not multiple functions, and it seems to require an implicit UNNEST() which is not what this patch does. There may be something wrong with that reading of the spec, though, because if it's right then the spec's TABLE() is just a pointless alternative spelling of UNNEST(). After further review of that, we might choose to adopt a different syntax for what this patch does, but in any case this functionality seems clearly worthwhile. Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and significantly revised by me	2013-11-21 19:37:20 -05:00
Heikki Linnakangas	04eee1fa9e	More GIN refactoring. Split off the portion of ginInsertValue that inserts the tuple to current level into a separate function, ginPlaceToPage. ginInsertValue's charter is now to recurse up the tree to insert the downlink, when a page split is required. This is in preparation for a patch to change the way incomplete splits are handled, which will need to do these operations separately. And IMHO makes the code more readable anyway.	2013-11-20 17:01:33 +02:00
Heikki Linnakangas	501012631e	Refactor the internal GIN B-tree interface for forming a downlink. This creates a new gin-btree callback function for creating a downlink for a page. Previously, ginxlog.c duplicated the logic used during normal operation.	2013-11-20 16:57:41 +02:00
Heikki Linnakangas	04965ad40e	Further GIN refactoring. Merge some functions that were always called together. Makes the code a little bit more readable.	2013-11-20 16:09:14 +02:00
Robert Haas	f1df4731ee	Use cstring_to_text_with_len when length is known. This avoids a potentially-expensive extra call to strlen(). David Rowley	2013-11-18 10:19:00 -05:00
Heikki Linnakangas	4c697d8f48	Count locked pages that don't need vacuuming as scanned. Previously, if VACUUM skipped vacuuming a page because it's pinned, it didn't count that page as scanned. However, that meant that relfrozenxid was not bumped up either, which prevented anti-wraparound vacuum from doing its job. Report by Миша Тюрин, analysis and patch by Sergey Burladyn and Jeff Janes. Backpatch to 9.2, where the skip-locked-pages behavior was introduced.	2013-11-18 09:51:09 +02:00
Tom Lane	f901bb50e3	Add make_date() and make_time() functions. Pavel Stehule, reviewed by Jeevan Chalke and Atri Sharma	2013-11-17 15:06:50 -05:00
Tom Lane	69c8fbac20	Improve performance of numeric sum(), avg(), stddev(), variance(), etc. This patch improves performance of most built-in aggregates that formerly used a NUMERIC or NUMERIC array as their transition type; this includes not only aggregates on numeric inputs, but some aggregates on integer inputs where overflow of an int8 value is a possibility. The code now uses a special-purpose data structure to avoid array construction and deconstruction overhead, as well as packing and unpacking overhead for numeric values. These aggregates' transition type is now declared as INTERNAL, since it doesn't correspond to any SQL data type. To keep the planner from thinking that that means a lot of storage will be used, we make use of the just-added pg_aggregate.aggtransspace feature. The space estimate is set to 128 bytes, which is at least in the right ballpark. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 18:46:34 -05:00
Tom Lane	6cb86143e8	Allow aggregates to provide estimates of their transition state data size. Formerly the planner had a hard-wired rule of thumb for guessing the amount of space consumed by an aggregate function's transition state data. This estimate is critical to deciding whether it's OK to use hash aggregation, and in many situations the built-in estimate isn't very good. This patch adds a column to pg_aggregate wherein a per-aggregate estimate can be provided, overriding the planner's default, and infrastructure for setting the column via CREATE AGGREGATE. It may be that additional smarts will be required in future, perhaps even a per-aggregate estimation function. But this is already a step forward. This is extracted from a larger patch to improve the performance of numeric and int8 aggregates. I (tgl) thought it was worth reviewing and committing this infrastructure separately. In this commit, all built-in aggregates are given aggtransspace = 0, so no behavior should change. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 16:03:40 -05:00
Tom Lane	f1f21b2d6f	Fix incorrect loop counts in tidbitmap.c. A couple of places that should have been iterating over WORDS_PER_CHUNK words were iterating over WORDS_PER_PAGE words instead. This thinko accidentally failed to fail, because (at least on common architectures with default BLCKSZ) WORDS_PER_CHUNK is a bit less than WORDS_PER_PAGE, and the extra words being looked at were always zero so nothing happened. Still, it's a bug waiting to happen if anybody ever fools with the parameters affecting TIDBitmap sizes, and it's a small waste of cycles too. So back-patch to all active branches. Etsuro Fujita	2013-11-15 18:34:14 -05:00
Tom Lane	f3b3b8d5be	Compute correct em_nullable_relids in get_eclass_for_sort_expr(). Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr must be able to compute valid em_nullable_relids for any new equivalence class members it creates. I'd worried about this in the commit message for `db9f0e1d9a`, but claimed that it wasn't a problem because multi-member ECs should already exist when it runs. That is transparently wrong, though, because this function is also called by initialize_mergeclause_eclasses, which runs during deconstruct_jointree. The example given in the bug report (which the new regression test item is based upon) fails because the COALESCE() expression is first seen by initialize_mergeclause_eclasses rather than process_equivalence. Fixing this requires passing the appropriate nullable_relids set to get_eclass_for_sort_expr, and it requires new code to compute that set for top-level expressions such as ORDER BY, GROUP BY, etc. We store the top-level nullable_relids in a new field in PlannerInfo to avoid computing it many times. In the back branches, I've added the new field at the end of the struct to minimize ABI breakage for planner plugins. There doesn't seem to be a good alternative to changing get_eclass_for_sort_expr's API signature, though. There probably aren't any third-party extensions calling that function directly; moreover, if there are, they probably need to think about what to pass for nullable_relids anyway. Back-patch to 9.2, like the previous patch in this area.	2013-11-15 16:46:18 -05:00
Tom Lane	80e3a470ba	Minor comment corrections for sequence hashtable patch. There were enough typos in the comments to annoy me ...	2013-11-15 12:17:12 -05:00
Heikki Linnakangas	5cb719beee	Fix bogus hash table creation. Andres Freund	2013-11-15 14:23:40 +02:00
Heikki Linnakangas	21025d4a53	Use a hash table to store current sequence values. This speeds up nextval() and currval(), when you touch a lot of different sequences in the same backend. David Rowley	2013-11-15 12:29:38 +02:00
Robert Haas	c46c803f8a	Fix relfilenodemap.c's handling of cache invalidations. The old code entered a new hash table entry first, then scanned pg_class to determine what value to fill in, and then populated the entry. This fails to work properly if a cache invalidation happens as a result of opening pg_class. Repair. Along the way, get rid of the idea of blowing away the entire hash table as a method of processing invalidations. Instead, just delete all the entries one by one. This is probably not quite as cheap but it's simpler, and shouldn't happen often. Andres Freund	2013-11-13 10:52:59 -05:00
Heikki Linnakangas	07fca603b5	Fix bug in GIN posting tree root creation. The root page is filled with as many items as fit, and the rest are inserted using normal insertions. However, I fumbled the variable names, and the code actually memcpy'd all the items on the page, overflowing the buffer. While at it, rename the variable to make the distinction more clear. Reported by Teodor Sigaev. This bug was introduced by my recent refactorings, so no backpatching required.	2013-11-13 13:47:59 +02:00
Peter Eisentraut	aa04b323c3	Move variable closer to where it is used. This avoids an unused variable warning on Windows when building without asserts. From: David Rowley <dgrowleyml@gmail.com>	2013-11-13 06:26:27 -05:00
Tom Lane	ebefbb5fde	Fix failure with whole-row reference to a subquery. Simple oversight in commit `1cb108efb0` --- recursively examining a subquery output column is only sane if the original Var refers to a single output column. Found by Kevin Grittner.	2013-11-11 16:36:27 -05:00
Tom Lane	0b7e660d6c	Fix ruleutils pretty-printing to not generate trailing whitespace. The pretty-printing logic in ruleutils.c operates by inserting a newline and some indentation whitespace into strings that are already valid SQL. This naturally results in leaving some trailing whitespace before the newline in many cases; which can be annoying when processing the output with other tools, as complained of by Joe Abbate. We can fix that in a pretty localized fashion by deleting any trailing whitespace before we append a pretty-printing newline. In addition, we have to modify the code inserted by commit `2f582f76b1` so that we also delete trailing whitespace when transposing items from temporary buffers into the main result string, when a temporary item starts with a newline. This results in rather voluminous changes to the regression test results, but it's easily verified that they are only removal of trailing whitespace. Back-patch to 9.3, because the aforementioned commit resulted in many more cases of trailing whitespace than had occurred in earlier branches.	2013-11-11 13:36:38 -05:00
Tom Lane	648bd05b13	Re-allow duplicate aliases within aliased JOINs. Although the SQL spec forbids duplicate table aliases, historically we've allowed queries like SELECT ... FROM tab1 x CROSS JOIN (tab2 x CROSS JOIN tab3 y) z on the grounds that the aliased join (z) hides the aliases within it, therefore there is no conflict between the two RTEs named "x". The LATERAL patch broke this, on the misguided basis that "x" could be ambiguous if tab3 were a LATERAL subquery. To avoid breaking existing queries, it's better to allow this situation and complain only if tab3 actually does contain an ambiguous reference. We need only remove the check that was throwing an error, because the column lookup code is already prepared to handle ambiguous references. Per bug #8444.	2013-11-11 10:42:57 -05:00
Peter Eisentraut	001e114b8d	Fix whitespace issues found by git diff --check, add gitattributes. Set per file type attributes in .gitattributes to fine-tune whitespace checks. With the associated cleanups, the tree is now clean for git	2013-11-10 14:48:29 -05:00
Heikki Linnakangas	ac4ab97ec0	Fix race condition in GIN posting tree page deletion. If a page is deleted, and reused for something else, just as a search is following a rightlink to it from its left sibling, the search would continue scanning whatever the new contents of the page are. That could lead to incorrect query results, or even something more curious if the page is reused for a different kind of a page. To fix, modify the search algorithm to lock the next page before releasing the previous one, and refrain from deleting pages from the leftmost branch of the tree. Add a new Concurrency section to the README, explaining why this works. There is a lot more one could say about concurrency in GIN, but that's for another patch. Backpatch to all supported versions.	2013-11-08 22:21:42 +02:00
Robert Haas	07cacba983	Add the notion of REPLICA IDENTITY for a table. Pending patches for logical replication will use this to determine which columns of a tuple ought to be considered as its candidate key. Andres Freund, with minor, mostly cosmetic adjustments by me	2013-11-08 12:30:43 -05:00
Tom Lane	b97ee66cc1	Make contain_volatile_functions/contain_mutable_functions look into SubLinks. This change prevents us from doing inappropriate subquery flattening in cases such as dangerous functions hidden inside a sub-SELECT in the targetlist of another sub-SELECT. That could result in unexpected behavior due to multiple evaluations of a volatile function, as in a recent complaint from Etienne Dube. It's been questionable from the very beginning whether these functions should look into subqueries (as noted in their comments), and this case seems to provide proof that they should. Because the new code only descends into SubLinks, not SubPlans or InitPlans, the change only affects the planner's behavior during prepjointree processing and not later on --- for example, you can still get it to use a volatile function in an indexqual if you wrap the function in (SELECT ...). That's a historical behavior, for sure, but it's reasonable given that the executor's evaluation rules for subplans don't depend on whether there are volatile functions inside them. In any case, we need to constrain the behavioral change as narrowly as we can to make this reasonable to back-patch.	2013-11-08 11:36:57 -05:00
Tom Lane	060b22a99a	Fix subtly-wrong volatility checking in BeginCopyFrom(). contain_volatile_functions() is best applied to the output of expression_planner(), not its input, so that insertion of function default arguments and constant-folding have been done. (See comments at CheckMutability, for instance.) It's perhaps unlikely that anyone will notice a difference in practice, but still we should do it properly. In passing, change variable type from Node* to Expr* to reduce the net number of casts needed. Noted while perusing uses of contain_volatile_functions().	2013-11-08 08:59:39 -05:00
Tom Lane	20803d7881	Make LOCK_PRINT & PROCLOCK_PRINT expand to ((void) 0) when not in use. This avoids warnings from more-anal-than-average compilers, and might prevent hidden syntax problems in the future. Andres Freund	2013-11-07 19:07:48 -05:00
Kevin Grittner	b64b5ccb6a	Silence benign warnings from clang version 3.0-6ubuntu3.	2013-11-07 16:35:43 -06:00
Tom Lane	c28b289bf3	Prevent display of dropped columns in row constraint violation messages. ExecBuildSlotValueDescription() printed "null" for each dropped column in a row being complained of by ExecConstraints(). This has some sanity in terms of the underlying implementation, but is of course pretty surprising to users. To fix, we must pass the target relation's descriptor to ExecBuildSlotValueDescription(), because the slot descriptor it had been using doesn't get labeled with attisdropped markers. Per bug #8408 from Maxim Boguk. Back-patch to 9.2 where the feature of printing row values in NOT NULL and CHECK constraint violation messages was introduced. Michael Paquier and Tom Lane	2013-11-07 14:41:36 -05:00
Tom Lane	5e900bc00f	Fix generation of MergeAppend plans for optimized min/max on expressions. Before jamming a desired targetlist into a plan node, one really ought to make sure the plan node can handle projections, and insert a buffering Result plan node if not. planagg.c forgot to do this, which is a hangover from the days when it only dealt with IndexScan plan types. MergeAppend doesn't project though, not to mention that it gets unhappy if you remove its possibly-resjunk sort columns. The code accidentally failed to fail for cases in which the min/max argument was a simple Var, because the new targetlist would be equivalent to the original "flat" tlist anyway. For any more complex case, it's been broken since 9.1 where we introduced the ability to optimize min/max using MergeAppend, as reported by Raphael Bauduin. Fix by duplicating the logic from grouping_planner that decides whether we need a Result node. In 9.2 and 9.1, this requires back-porting the tlist_same_exprs() function introduced in commit `4387cf956b`, else we'd uselessly add a Result node in cases that worked before. It's rather tempting to back-patch that whole commit so that we can avoid extra Result nodes in mainline cases too; but I'll refrain, since that code hasn't really seen all that much field testing yet.	2013-11-07 13:14:14 -05:00
Heikki Linnakangas	fde7172d93	Fix setting of right bound at GIN page split. Broken by my refactoring.	2013-11-07 19:45:07 +02:00
Tom Lane	8dace66e07	Add #ifdef guards for some POSIX error symbols that Windows doesn't like. Per buildfarm results. It looks like the older the Windows version, the more errno codes it hasn't got ...	2013-11-06 20:22:42 -05:00
Tom Lane	8e68816cc2	Be more robust when strerror() doesn't give a useful result. glibc, at least, is capable of returning "???" instead of anything useful if it doesn't like the setting of LC_CTYPE. If this happens, or in the previously-known case of strerror() returning an empty string, try to print the C macro name for the error code ("EACCES" etc). Only if we don't have the error code in our compiled-in list of popular error codes (which covers most though not quite all of what's called out in the POSIX spec) will we fall back to printing a numeric error code. This should simplify debugging. Note that this functionality is currently only provided for %m in backend ereport/elog messages. That may be sufficient, since we don't fool with the locale environment in frontend clients, but it's foreseeable that we might want similar code in libpq for instance. There was some talk of back-patching this, but let's see how the buildfarm likes it first. It seems likely that at least some of the POSIX-defined error code symbols don't exist on all platforms. I don't want to clutter the entire list with #ifdefs, but we may need more than are here now. MauMau, edited by me	2013-11-06 15:50:17 -05:00
Tom Lane	bb45c64041	Support default arguments and named-argument notation for window functions. These things didn't work because the planner omitted to do the necessary preprocessing of a WindowFunc's argument list. Add the few dozen lines of code needed to handle that. Although this sounds like a feature addition, it's really a bug fix because the default-argument case was likely to crash previously, due to lack of checking of the number of supplied arguments in the built-in window functions. It's not a security issue because there's no way for a non-superuser to create a window function definition with defaults that refers to a built-in C function, but nonetheless people might be annoyed that it crashes rather than producing a useful error message. So back-patch as far as the patch applies easily, which turns out to be 9.2. I'll put a band-aid in earlier versions as a separate patch. (Note that these features still don't work for aggregates, and fixing that case will be harder since we represent aggregate arg lists as target lists not bare expression lists. There's no crash risk though because CREATE AGGREGATE doesn't accept defaults, and we reject named-argument notation when parsing an aggregate call.)	2013-11-06 13:33:09 -05:00
Kevin Grittner	5829082a57	Keep heap open until new heap generated in RMV. Early close became apparent when invalidation messages were processed in a new location under CLOBBER_CACHE_ALWAYS builds, due to additional locking. Back-patch to 9.3	2013-11-06 12:27:52 -06:00
Heikki Linnakangas	0ea53256a8	Fix missing argument and function prototypes. Not sure how I missed these in previous commit.	2013-11-06 11:22:58 +02:00
Heikki Linnakangas	ecaa4708e5	Misc GIN refactoring. Merge the isEnoughSpace and placeToPage functions in the b-tree interface into one function that tries to put a tuple on page, and returns false if it doesn't fit. Move createPostingTree function to gindatapage.c, and change its contract so that it can be passed more items than fit on the root page. It's in a better position than the callers to know how many items fit. Move ginMergeItemPointers out of gindatapage.c, into a separate file. These changes make no difference now, but reduce the footprint of Alexander Korotkov's upcoming patch to pack item pointers more tightly.	2013-11-06 10:32:09 +02:00
Tom Lane	920c8261d5	Improve the error message given for modifying a window with frame clause. For rather inscrutable reasons, SQL:2008 disallows copying-and-modifying a window definition that has any explicit framing clause. The error message we gave for this only made sense if the referencing window definition itself contains an explicit framing clause, which it might well not. Moreover, in the context of an OVER clause it's not exactly obvious that "OVER (windowname)" implies copy-and-modify while "OVER windowname" does not. This has led to multiple complaints, eg bug #5199 from Iliya Krapchatov. Change to a hopefully more intelligible error message, and in the case where we have just "OVER (windowname)", add a HINT suggesting that omitting the parentheses will fix it. Also improve the related documentation. Back-patch to all supported branches.	2013-11-05 21:58:08 -05:00
Kevin Grittner	2636ecf78b	Lock relation used to generate fresh data for RMV. The relation should not be accessible to any other process, but it should be locked for consistency. Since this is not known to cause any bug, it will not be back-patched, at least for now. Per report from Andres Freund	2013-11-05 15:36:33 -06:00
Tom Lane	6331de1d44	Fix some obsolete information in src/backend/optimizer/README. Constant quals aren't handled the same way they used to be. Also, add mention of a couple more major steps in grouping_planner. Per complaint a couple months back from Etsuro Fujita.	2013-11-05 11:31:35 -05:00
Kevin Grittner	732758db4c	Fix breakage of MV column name list usage. Per bug report from Tomonari Katsumata. Back-patch to 9.3.	2013-11-04 14:31:07 -06:00
Robert Haas	dddc34408a	Fix format code used to print dsm request sizes. Per report from Peter Eisentraut.	2013-11-04 11:22:03 -05:00
Tom Lane	e36ce0c7f7	Get rid of more cases of the "must detoast before output function" meme. I missed that json.c was doing this too, because for some bizarre reason it wasn't doing it adjacent to the output function call.	2013-11-03 11:55:37 -05:00
Tom Lane	b006f4ddb9	Prevent memory leaks from accumulating across printtup() calls. Historically, printtup() has assumed that it could prevent memory leakage by pfree'ing the string result of each output function and manually managing detoasting of toasted values. This amounts to assuming that datatype output functions never leak any memory internally; an assumption we've already decided to be bogus elsewhere, for example in COPY OUT. range_out in particular is known to leak multiple kilobytes per call, as noted in bug #8573 from Godfried Vanluffelen. While we could go in and fix that leak, it wouldn't be very notationally convenient, and in any case there have been and undoubtedly will again be other leaks in other output functions. So what seems like the best solution is to run the output functions in a temporary memory context that can be reset after each row, as we're doing in COPY OUT. Some quick experimentation suggests this is actually a tad faster than the retail pfree's anyway. This patch fixes all the variants of printtup, except for debugtup() which is used in standalone mode. It doesn't seem worth worrying about query-lifespan leaks in standalone mode, and fixing that case would be a bit tedious since debugtup() doesn't currently have any startup or shutdown functions. While at it, remove manual detoast management from several other output-function call sites that had copied it from printtup(). This doesn't make a lot of difference right now, but in view of recent discussions about supporting "non-flattened" Datums, we're going to want that code gone eventually anyway. Back-patch to 9.2 where range_out was introduced. We might eventually decide to back-patch this further, but in the absence of known major leaks in older output functions, I'll refrain for now.	2013-11-03 11:33:05 -05:00
Kevin Grittner	2a781d57dc	Acquire appropriate locks when rewriting during RMV. Since the query has not been freshly parsed when executing REFRESH MATERIALIZED VIEW, locks must be explicitly taken before rewrite. Backpatch to 9.3. Andres Freund	2013-11-02 19:18:08 -05:00
Kevin Grittner	be420fa02e	Fix subquery reference to non-populated MV in CMV. A subquery reference to a matview should be allowed by CREATE MATERIALIZED VIEW WITH NO DATA, just like a direct reference is. Per bug report from Laurent Sartran. Backpatch to 9.3.	2013-11-02 18:38:17 -05:00
Tom Lane	24ace4053d	Retry after buffer locking failure during SPGiST index creation. The original coding thought this case was impossible, but it can happen if the bgwriter or checkpointer processes decide to write out an index page while creation is still proceeding, leading to a bogus "unexpected spgdoinsert() failure" error. Problem reported by Jonathan S. Katz. Teodor Sigaev	2013-11-02 16:45:42 -04:00
Tom Lane	bffd1ce92c	Ensure all files created for a single BufFile have the same resource owner. Callers expect that they only have to set the right resource owner when creating a BufFile, not during subsequent operations on it. While we could insist this be fixed at the caller level, it seems more sensible for the BufFile to take care of it. Without this, some temp files belonging to a BufFile can go away too soon, eg at the end of a subtransaction, leading to errors or crashes. Reported and fixed by Andres Freund. Back-patch to all active branches.	2013-11-01 16:09:48 -04:00
Tom Lane	45f64f1bbf	Remove CTimeZone/HasCTZSet, root and branch. These variables no longer have any useful purpose, since there's no reason to special-case brute force timezones now that we have a valid session_timezone setting for them. Remove the variables, and remove the SET/SHOW TIME ZONE code that deals with them. The user-visible impact of this is that SHOW TIME ZONE will now show a POSIX-style zone specification, in the form "<+-offset>-+offset", rather than an interval value when a brute-force zone has been set. While perhaps less intuitive, this is a better definition than before because it's actually possible to give that string back to SET TIME ZONE and get the same behavior, unlike what used to happen. We did not previously mention the angle-bracket syntax when describing POSIX timezone specifications; add some documentation so that people can figure out what these strings do. (There's still quite a lot of undocumented functionality there, but anybody who really cares can go read the POSIX spec to find out about it. In practice most people seem to prefer Olson-style city names anyway.)	2013-11-01 13:57:31 -04:00
Tom Lane	1c8a7f617f	Remove internal uses of CTimeZone/HasCTZSet. The only remaining places where we actually look at CTimeZone/HasCTZSet are abstime2tm() and timestamp2tm(). Now that session_timezone is always valid, we can remove these special cases. The caller-visible impact of this is that these functions now always return a valid zone abbreviation if requested, whereas before they'd return a NULL pointer if a brute-force timezone was in use. In the existing code, the only place I can find that changes behavior is to_char(), whose TZ format code will now print something useful rather than nothing for such zones. (In the places where the returned zone abbreviation is passed to EncodeDateTime, the lack of visible change is because we've chosen the abbreviation used for these zones to match what EncodeTimezone would have printed.) It's likely that there is now a fair amount of removable dead code around the call sites, namely anything that's meant to cope with getting a NULL timezone abbreviation, but I've not made an effort to root that out. This could be back-patched if we decide we'd like to fix to_char()'s behavior in the back branches, but there doesn't seem to be much enthusiasm for that at present.	2013-11-01 12:51:27 -04:00
Tom Lane	631dc390f4	Fix some odd behaviors when using a SQL-style simple GMT offset timezone. Formerly, when using a SQL-spec timezone setting with a fixed GMT offset (called a "brute force" timezone in the code), the session_timezone variable was not updated to match the nominal timezone; rather, all code was expected to ignore session_timezone if HasCTZSet was true. This is of course obviously fragile, though a search of the code finds only timeofday() failing to honor the rule. A bigger problem was that DetermineTimeZoneOffset() supposed that if its pg_tz parameter was pointer-equal to session_timezone, then HasCTZSet should override the parameter. This would cause datetime input containing an explicit zone name to be treated as referencing the brute-force zone instead, if the zone name happened to match the session timezone that had prevailed before installing the brute-force zone setting (as reported in bug #8572). The same malady could affect AT TIME ZONE operators. To fix, set up session_timezone so that it matches the brute-force zone specification, which we can do using the POSIX timezone definition syntax "<abbrev>offset", and get rid of the bogus lookaside check in DetermineTimeZoneOffset(). Aside from fixing the erroneous behavior in datetime parsing and AT TIME ZONE, this will cause the timeofday() function to print its result in the user-requested time zone rather than some previously-set zone. It might also affect results in third-party extensions, if there are any that make use of session_timezone without considering HasCTZSet, but in all cases the new behavior should be saner than before. Back-patch to all supported branches.	2013-11-01 12:13:18 -04:00
Robert Haas	cacbdd7810	Use appendStringInfoString instead of appendStringInfo where possible. This shaves a few cycles, and generally seems like good programming practice. David Rowley	2013-10-31 10:55:59 -04:00
Robert Haas	343bb134ea	Avoid too-large shift on 32-bit Windows. Apparently, shifts greater than or equal to the width of the type are undefined, and can surprisingly produce a non-zero value. Amit Kapila, with a comment by me.	2013-10-30 09:14:56 -04:00
Tom Lane	9a9473f3cc	Prevent using strncpy with src == dest in TupleDescInitEntry. The C and POSIX standards state that strncpy's behavior is undefined when source and destination areas overlap. While it remains dubious whether any implementations really misbehave when the pointers are exactly equal, some platforms are now starting to force the issue by complaining when an undefined call occurs. (In particular OS X 10.9 has been seen to dump core here, though the exact set of circumstances needed to trigger that remain elusive. Similar behavior can be expected to be optional on Linux and other platforms in the near future.) So tweak the code to explicitly do nothing when nothing need be done. Back-patch to all active branches. In HEAD, this also lets us get rid of an exception in valgrind.supp. Per discussion of a report from Matthias Schmitt.	2013-10-28 20:49:24 -04:00
Robert Haas	d2aecaea15	Modify dynamic shared memory code to use Size rather than uint64. This is more consistent with what we do elsewhere.	2013-10-28 12:12:06 -04:00
Tom Lane	c2b51cf190	Improve documentation about usage of FDW validator functions. SGML documentation, as well as code comments, failed to note that an FDW's validator will be applied to foreign-table options for foreign tables using the FDW. Etsuro Fujita	2013-10-28 10:28:35 -04:00
Noah Misch	c50b7c09d8	Add large object functions catering to SQL callers. With these, one need no longer manipulate large object descriptors and extract numeric constants from header files in order to read and write large object contents from SQL. Pavel Stehule, reviewed by Rushabh Lathia.	2013-10-27 22:56:54 -04:00
Tom Lane	43fe90f66a	Suppress -0 in the C field of lines computed by line_construct_pts(). It's not entirely clear why some PPC machines are generating -0 here, since the underlying computation should be exactly 0 - 0. Perhaps there's some wider-than-nominal-precision calculations happening? Anyway, the best way to avoid platform-dependent results seems to be to explicitly reset -0 to regular zero.	2013-10-25 15:55:15 -04:00
Tom Lane	3147acd63e	Use improved vsnprintf calling logic in more places. When we are using a C99-compliant vsnprintf implementation (which should be most places, these days) it is worth the trouble to make use of its report of how large the buffer needs to be to succeed. This patch adjusts stringinfo.c and some miscellaneous usages in pg_dump to do that, relying on the logic recently added in libpgcommon's psprintf.c. Since these places want to know the number of bytes written once we succeed, modify the API of pvsnprintf() to report that. There remains near-duplicate logic in pqexpbuffer.c, but since that code is in libpq, psprintf.c's approach of exit()-on-error isn't appropriate for use there. Also note that I didn't bother touching the multitude of places that call (v)snprintf without any attempt to provide a resizable buffer. Release-note-worthy incompatibility: the API of appendStringInfoVA() changed. If there's any third-party code that's calling that directly, it will need tweaking along the same lines as in this patch. David Rowley and Tom Lane	2013-10-24 21:43:57 -04:00
Heikki Linnakangas	98c50656ca	Increase the number of different values used when seeding random(). When a backend process is forked, we initialize the system's random number generator with srandom(). The seed used is derived from the backend's pid and the timestamp. However, we only used the microseconds part of the timestamp, and it was XORed with the pid, so the total range of different seed values chosen was 0-999999. That's quite limited. Change the code to also use the seconds part of the timestamp in the seed, and shift the microseconds so that all 32 bits of the seed are used. Honza Horak	2013-10-24 17:00:18 +03:00
Heikki Linnakangas	138184adc5	Plug memory leak when reloading config file. The absolute path to config file was not pfreed. There are probably more small leaks here and there in the config file reload code and assign hooks, and in practice no-one reloads the config files frequently enough for it to be a problem, but this one is trivial enough that might as well fix it. Backpatch to 9.3 where the leak was introduced.	2013-10-24 15:27:40 +03:00
Heikki Linnakangas	bb598456dc	Fix memory leak when an empty ident file is reloaded. Hari Babu	2013-10-24 14:03:26 +03:00
Heikki Linnakangas	4d6d425ab8	Fix typos in comments.	2013-10-24 11:50:02 +03:00
Heikki Linnakangas	83eb54001c	Fix two bugs in setting the vm bit of empty pages. Use a critical section when setting the all-visible flag on an empty page, and WAL-logging it. log_newpage_buffer() contains an assertion that it must be called inside a critical section, and it's the right thing to do when modifying a buffer anyway. Also, the page should be marked dirty before calling log_newpage_buffer(), per the comment in log_newpage_buffer() and src/backend/access/transam/README. Patch by Andres Freund, in response to my report. Backpatch to 9.2, like the patch that introduced these bugs (`a6370fd9`).	2013-10-23 14:24:37 +03:00
Tom Lane	5f1ab46101	Suppress a couple of compiler warnings seen with older gcc versions. To wit, bgworker.c: In function `RegisterDynamicBackgroundWorker': bgworker.c:761: warning: `generation' might be used uninitialized in this function dsm_impl.c: In function `dsm_impl_op': dsm_impl.c:197: warning: control reaches end of non-void function Neither of these represent actual bugs, but we may as well tweak the code so that more compilers can tell that. This won't change the generated code on compilers that do recognize that the cases are unreachable.	2013-10-22 21:31:57 -04:00
Tom Lane	2c66f9924c	Replace pg_asprintf() with psprintf(). This eliminates an awkward coding pattern that's also unnecessarily inconsistent with backend coding. psprintf() is now the thing to use everywhere.	2013-10-22 19:40:26 -04:00
Tom Lane	09a89cb5fc	Get rid of use of asprintf() in favor of a more portable implementation. asprintf(), aside from not being particularly portable, has a fundamentally badly-designed API; the psprintf() function that was added in passing in the previous patch has a much better API choice. Moreover, the NetBSD implementation that was borrowed for the previous patch doesn't work with non-C99-compliant vsnprintf, which is something we still have to cope with on some platforms; and it depends on va_copy which isn't all that portable either. Get rid of that code in favor of an implementation similar to what we've used for many years in stringinfo.c. Also, move it into libpgcommon since it's not really libpgport material. I think this patch will be enough to turn the buildfarm green again, but there's still cosmetic work left to do, namely get rid of pg_asprintf() in favor of using psprintf(). That will come in a followon patch.	2013-10-22 18:42:13 -04:00
Peter Eisentraut	586a8fc75b	Make use of psprintf() in recent changes	2013-10-22 07:04:41 -04:00
Tom Lane	2885881147	Fix blatantly broken record_image_cmp() logic for pass-by-value fields. Doesn't anybody here pay attention to compiler warnings?	2013-10-22 00:38:53 -04:00
Noah Misch	709170b790	Consistently use unsigned arithmetic for alignment calculations. This avoids an assumption about the signed number representation. It is anticipated to have no functional changes on supported configurations; many two's complement assumptions remain elsewhere. Per a suggestion from Andres Freund.	2013-10-20 21:04:52 -04:00
Peter Eisentraut	713a9f210d	Add libpgcommon to backend gettext source files. This ought to have been done when libpgcommon was split off from libpgport.	2013-10-19 13:49:05 -04:00
Robert Haas	cab5dc5daf	Allow only some columns of a view to be auto-updateable. Previously, unless all columns were auto-updateable, we wouldn't allow inserts, updates, or deletes, or at least not without a rule or trigger; now, we'll allow inserts and updates that target only the auto-updateable columns, and deletes even if there are no auto-updateable columns at all provided the view definition is otherwise suitable. Dean Rasheed, reviewed by Marko Tiikkaja	2013-10-18 10:35:36 -04:00
Robert Haas	523beaa11b	Provide a reliable mechanism for terminating a background worker. Although previously-introduced APIs allow the process that registers a background worker to obtain the worker's PID, there's no way to prevent a worker that is not currently running from being restarted. This patch introduces a new API TerminateBackgroundWorker() that prevents the background worker from being restarted, terminates it if it is currently running, and causes it to be unregistered if or when it is not running. Patch by me. Review by Michael Paquier and KaiGai Kohei.	2013-10-18 10:23:11 -04:00
Robert Haas	ea91a6be89	Remove IRIX port. Development of IRIX has been discontinued, and support is scheduled to end in December of 2013. Therefore, there will be no supported versions of this operating system by the time PostgreSQL 9.4 is released. Furthermore, we have no maintainer for this platform.	2013-10-18 08:14:21 -04:00
Robert Haas	81051a86bc	Remove spinlock support for SINIX, Sun3, and NS32K. All of these platforms are very much obsolete. As far as I can determine, the last version of SINIX, later renamed Reliant, occurred some time between 2002 and 2005. The last release of SunOS that would run on a sun3 was released in November of 1991; the last release of OpenBSD which supported that platform was in 2001. The highest clock speed of any processor in the family was 25MHz. The NS32K (national semiconductor 320xx) architecture was retired in 1990. Support can be re-added if a maintainer emerges for any of these platforms, but it seems unlikely. Reviewed by Andres Freund.	2013-10-17 12:02:05 -04:00
Alvaro Herrera	86029b31e5	Silence compiler warning when SSL not in use. Per Jaime Casanova and Vik Fearing	2013-10-17 11:28:50 -03:00
Bruce Momjian	7778ddc7a2	Allow 5+ digit years for non-ISO timestamp/date strings, where appropriate. Report from Haribabu Kommi	2013-10-16 13:22:55 -04:00
Robert Haas	e515861367	In dsm_impl_windows, don't error out when the segment already exists. This is the behavior of the other implementations, and the behavior expected by the callers of this function. Amit Kapila	2013-10-14 11:48:49 -04:00
Robert Haas	05a0283e7a	Fix details missed by dynamic shared memory patch. Additional documentation update, and a comment fix. Both issues reported by Amit Kapila.	2013-10-14 08:00:26 -04:00
Peter Eisentraut	5b6d08cd29	Add use of asprintf(). Add asprintf(), pg_asprintf(), and psprintf() to simplify string allocation and composition. Replacement implementations taken from NetBSD. Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Asif Naeem <anaeem.it@gmail.com>	2013-10-13 00:09:18 -04:00
Kevin Grittner	4cbb646334	Fix several possibly non-portable gaffes in record_image_ops. Sparc machines in the buildfarm were made happy by the previous fix, but PowerPC machines are still failing. Hopefully this will cure that.	2013-10-11 13:02:52 -05:00
Alvaro Herrera	ada01014d4	Use $(PERL) to invoke duplicate_oids. Per buildfarm failure reported by smilodon	2013-10-10 23:45:38 -03:00
Alvaro Herrera	31cf1a1a43	Rework SSL renegotiation code. The existing renegotiation code was home to several bugs: it might erroneously report that renegotiation had failed; it might try to execute another renegotiation while the previous one was pending; it failed to terminate the connection if the renegotiation never actually took place; if a renegotiation was started, the byte count was reset, even if the renegotiation wasn't completed (this isn't good from a security perspective because it means continuing to use a session that should be considered compromised due to volume of data transferred.) The new code is structured to avoid these pitfalls: renegotiation is started a little before the limit has expired; the handshake sequence is retried until it has actually returned successfully, and no more than that, but if it fails too many times, the connection is closed. The byte count is reset only when the renegotiation has succeeded, and if the renegotiation byte count limit expires, the connection is terminated. This commit only touches the master branch, because some of the changes are controversial. If everything goes well, a back-patch might be considered. Per discussion started by message 20130710212017.GB4941@eldon.alvh.no-ip.org	2013-10-10 23:45:20 -03:00
Peter Eisentraut	5dd41f3574	Remove maintainer-check target, fold into normal build. make maintainer-check was obscure and rarely called in practice, and many breakages were missed. Fold everything that make maintainer-check used to do into the normal build. Specifically: - Call duplicate_oids when genbki.pl is called. - Check for tabs in SGML files when the documentation is built. - Run msgfmt with the -c option during the regular build. Add an additional configure check to see whether we are using the GNU version. (make maintainer-check probably used to fail with non-GNU msgfmt.) Keep maintainer-check around as a phony target for the time being in case anyone is calling it. But it won't do anything anymore.	2013-10-10 20:11:56 -04:00
Kevin Grittner	15e46fd1dd	Fix bug in record_image_ops on big endian machines. The buildfarm pointed out the problem. Fix based on suggestion by Robert Haas.	2013-10-10 11:25:30 -05:00
Andrew Dunstan	4d212bac17	json_typeof function. Andrew Tipton.	2013-10-10 12:21:59 -04:00
Robert Haas	4b7b9a7904	Fix incorrect use of shm_unlink where unlink should be used. Per buildfarm.	2013-10-10 10:57:10 -04:00
Peter Eisentraut	261c7d4b65	Revive line type. Change the input/output format to {A,B,C}, to match the internal representation. Complete the implementations of line_in, line_out, line_recv, line_send. Remove comments and error messages about the line type not being implemented. Add regression tests for existing line operators and functions. Reviewed-by: rui hua <365507506hua@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com>	2013-10-09 22:34:38 -04:00
Robert Haas	0ac5e5a7e1	Allow dynamic allocation of shared memory segments. Patch by myself and Amit Kapila. Design help from Noah Misch. Review by Andres Freund.	2013-10-09 21:05:02 -04:00
Kevin Grittner	f566515192	Add record_image_ops opclass for matview concurrent refresh. REFRESH MATERIALIZED VIEW CONCURRENTLY was broken for any matview containing a column of a type without a default btree operator class. It also did not produce results consistent with a non-concurrent REFRESH or a normal view if any column was of a type which allowed user-visible differences between values which compared as equal according to the type's default btree opclass. Concurrent matview refresh was modified to use the new operators to solve these problems. Documentation was added for record comparison, both for the default btree operator class for record, and the newly added operators. Regression tests now check for proper behavior both for a matview with a box column and a matview containing a citext column. Reviewed by Steve Singer, who suggested some of the doc language.	2013-10-09 14:26:09 -05:00
Bruce Momjian	0c6b675076	Centralize effective_cache_size default setting	2013-10-09 08:33:12 -04:00
Bruce Momjian	96dfa6ec0d	Adjust the effective_cache_size default for standalone backends	2013-10-08 23:53:39 -04:00
Bruce Momjian	6b82f78ff9	Again move function where we set effective_cache_size's default	2013-10-08 23:12:45 -04:00
Bruce Momjian	cbafd6618a	Move new effective_cache_size function. Previously set_default_effective_cache_size() could not handle fork, non-fork, and bootstrap cases.	2013-10-08 22:41:23 -04:00
Bruce Momjian	bf46524b31	Fix C comment in check_effective_cache_size()	2013-10-08 19:25:26 -04:00
Bruce Momjian	6648775028	Update postgresql.conf.sample for effective_cache_size's new default	2013-10-08 12:50:05 -04:00
Bruce Momjian	ee1e5662d8	Auto-tune effective_cache_size to be 4x shared buffers	2013-10-08 12:12:24 -04:00
Heikki Linnakangas	5962519b36	TYPEALIGN doesn't work on int64 on 32-bit platforms. The TYPEALIGN macro, and the related ones like MAXALIGN, don't work with values larger than intptr_t, because TYPEALIGN casts the argument to intptr_t to do the arithmetic. That's not a problem when dealing with pointers or lengths or offsets related to pointers, but the XLogInsert scaling patch added a call to MAXALIGN with an XLogRecPtr argument. To fix, add wider variants of the macros, called TYPEALIGN64 and MAXALIGN64, which are just like the existing variants but work with uint64 instead of intptr_t. Report and patch by David Rowley, analysis by Andres Freund.	2013-10-08 01:59:57 +03:00
Heikki Linnakangas	81fbbfe335	Fix bugs in SSI tuple locking. 1. In heap_hot_search_buffer(), the PredicateLockTuple() call is passed the wrong offset number. heapTuple->t_self is set to the tid of the first tuple in the chain that's visited, not the one actually being read. 2. CheckForSerializableConflictIn() uses the tuple's t_ctid field instead of t_self to check for existing predicate locks on the tuple. If the tuple was updated, but the updater rolled back, t_ctid points to the aborted dead tuple. Reported by Hannu Krosing. Backpatch to 9.1.	2013-10-08 00:18:43 +03:00

13695 Commits