postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	7f41a472ab	Update release notes for 9.1.4, 9.0.8, 8.4.12, 8.3.19.	2012-05-31 19:03:57 -04:00
Peter Eisentraut	c272fcd7f6	Translation updates	2012-05-31 23:17:11 +03:00
Tom Lane	e4f088467c	Update time zone data files to tzdata release 2012c. DST law changes in Antarctica, Armenia, Chile, Cuba, Falkland Islands, Gaza, Haiti, Hebron, Morocco, Syria, Tokelau Islands. Historical corrections for Canada.	2012-05-31 00:48:23 -04:00
Tom Lane	8851d5e9bf	Ignore SECURITY DEFINER and SET attributes for a PL's call handler. It's not very sensible to set such attributes on a handler function; but if one were to do so, fmgr.c went into infinite recursion because it would call fmgr_security_definer instead of the handler function proper. There is no way for fmgr_security_definer to know that it ought to call the handler and not the original function referenced by the FmgrInfo's fn_oid, so it tries to do the latter, causing the whole process to start over again. Ordinarily such misconfiguration of a procedural language's handler could be written off as superuser error. However, because we allow non-superuser database owners to create procedural languages and the handler for such a language becomes owned by the database owner, it is possible for a database owner to crash the backend, which ideally shouldn't be possible without superuser privileges. In 9.2 and up we will adjust things so that the handler functions are always owned by superusers, but in existing branches this is a minor security fix. Problem noted by Noah Misch (after several of us had failed to detect it :-(). This is CVE-2012-2655.	2012-05-30 23:28:27 -04:00
Tom Lane	4d3482a741	Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich. We used to only allow offsets less than +/-13 hours, then it was +/14, then it was +/-15. That's still not good enough though, as per today's bug report from Patric Bechtel. This time I actually looked through the Olson timezone database to find the largest offsets used anywhere. The winners are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at +15:13:42 until 1867. So we'd better allow offsets less than +/-16 hours. Given the history, we are way overdue to have some greppable #define symbols controlling this, so make some ... and also remove an obsolete comment that didn't get fixed the last time. Back-patch to all supported branches.	2012-05-30 19:58:59 -04:00
Tom Lane	dd957a5bb9	Fix incorrect password transformation in contrib/pgcrypto's DES crypt(). Overly tight coding caused the password transformation loop to stop examining input once it had processed a byte equal to 0x80. Thus, if the given password string contained such a byte (which is possible though not highly likely in UTF8, and perhaps also in other non-ASCII encodings), all subsequent characters would not contribute to the hash, making the password much weaker than it appears on the surface. This would only affect cases where applications used DES crypt() to encode passwords before storing them in the database. If a weak password has been created in this fashion, the hash will stop matching after this update has been applied, so it will be easy to tell if any passwords were unexpectedly weak. Changing to a different password would be a good idea in such a case. (Since DES has been considered inadequately secure for some time, changing to a different encryption algorithm can also be recommended.) This code, and the bug, are shared with at least PHP, FreeBSD, and OpenBSD. Since the other projects have already published their fixes, there is no point in trying to keep this commit private. This bug has been assigned CVE-2012-2143, and credit for its discovery goes to Rubin Xu and Joseph Bonneau.	2012-05-30 10:53:48 -04:00
Tom Lane	b3d9db4617	Teach AbortOutOfAnyTransaction to clean up partially-started transactions. AbortOutOfAnyTransaction failed to do anything if the state it saw on entry corresponded to failing partway through StartTransaction. I fixed AbortCurrentTransaction to cope with that case way back in commit `60b2444cc3`, but evidently overlooked that AbortOutOfAnyTransaction should do likewise. Back-patch to all supported branches. It's not clear that this omission has any more-than-cosmetic consequences, but it's also not clear that it doesn't, so back-patching seems the least risky choice.	2012-05-28 23:57:32 -04:00
Tom Lane	422022b120	Prevent synchronized scanning when systable_beginscan chooses a heapscan. The only interesting-for-performance case wherein we force heapscan here is when we're rebuilding the relcache init file, and the only such case that is likely to be examining a catalog big enough to be syncscanned is RelationBuildTupleDesc. But the early-exit optimization in that code gets broken if we start the scan at a random place within the catalog, so that allowing syncscan is actually a big deoptimization if pg_attribute is large (at least for the normal case where the rows for core system catalogs have never been changed since initdb). Hence, prevent syncscan here. Per my testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane, though neither of them seem to have actually hit this specific problem. Back-patch to 8.3, where syncscan was introduced.	2012-05-26 19:10:19 -04:00
Tom Lane	6f163609bd	Fix string truncation to be multibyte-aware in text_name and bpchar_name. Previously, casts to name could generate invalidly-encoded results. Also, make these functions match namein() more exactly, by consistently using palloc0() instead of ad-hoc zeroing code. Back-patch to all supported branches. Karl Schnaitter and Tom Lane	2012-05-25 17:35:24 -04:00
Tom Lane	bd43c50a5b	Use binary search instead of brute-force scan in findNamespace(). The previous coding presented a significant bottleneck when dumping databases containing many thousands of schemas, since the total time spent searching would increase roughly as O(N^2) in the number of objects. Noted by Jeff Janes, though I rewrote his proposed patch to use the existing findObjectByOid infrastructure. Since this is a longstanding performance bug, backpatch to all supported versions.	2012-05-25 14:35:59 -04:00
Tom Lane	c994b9211f	Ensure that seqscans check for interrupts at least once per page. If a seqscan encounters many consecutive pages containing only dead tuples, it can remain in the loop in heapgettup for a long time, and there was no CHECK_FOR_INTERRUPTS anywhere in that loop. This meant there were real-world situations where a query would be effectively uncancelable for long stretches. Add a check placed to occur once per page, which should be enough to provide reasonable response time without adding any measurable overhead. Report and patch by Merlin Moncure (though I tweaked it a bit). Back-patch to all supported branches.	2012-05-22 19:42:28 -04:00
Heikki Linnakangas	5761556250	Fix bug in to_tsquery(). We were using memcpy() to copy to a possibly overlapping memory region, which is a no-no. Use memmove() instead.	2012-05-15 19:27:00 +03:00
Tom Lane	fcc0ba313b	Fix Windows implementation of PGSemaphoreLock. The original coding failed to reset ImmediateInterruptOK before returning, which would potentially allow a subsequent query-cancel interrupt to be accepted at an unsafe point. This is a really nasty bug since it's so hard to predict the consequences, but they could be unpleasant. Also, ensure that signal handlers are serviced before this function returns, even if the semaphore is already set. This should make the behavior more like Unix. Back-patch to all supported versions.	2012-05-10 13:36:33 -04:00
Magnus Hagander	e91f00148d	Remove link to ODBCng project from the docs. This backatches Heikki's patch in `140a4fbf1a` to make sure the documentation on the website gets updated, since we're regularly receiving complains about this link.	2012-05-03 13:03:00 +02:00
Tom Lane	092d1d9dd8	Fix printing of whole-row Vars at top level of a SELECT targetlist. Normally whole-row Vars are printed as "tabname.". However, that does not work at top level of a targetlist, because per SQL standard the parser will think that the "" should result in column-by-column expansion; which is not at all what a whole-row Var implies. We used to just print the table name in such cases, which works most of the time; but it fails if the table name matches a column name available anywhere in the FROM clause. This could lead for instance to a view being interpreted differently after dump and reload. Adding parentheses doesn't fix it, but there is a reasonably simple kluge we can use instead: attach a no-op cast, so that the "*" isn't syntactically at top level anymore. This makes the printing of such whole-row Vars a lot more consistent with other Vars, and may indeed fix more cases than just the reported one; I'm suspicious that cases involving schema qualification probably didn't work properly before, either. Per bug report and fix proposal from Abbas Butt, though this patch is quite different in detail from his. Back-patch to all supported versions.	2012-04-27 19:49:46 -04:00
Tom Lane	f8d7f9adfd	Fix syslogger's rotation disable/re-enable logic. If it fails to open a new log file, the syslogger assumes there's something wrong with its parameters (such as log_directory), and stops attempting automatic time-based or size-based log file rotations. Sending it SIGHUP is supposed to start that up again. However, the original coding for that was really bogus, involving clobbering a couple of GUC variables and hoping that SIGHUP processing would restore them. Get rid of that technique in favor of maintaining a separate flag showing we've turned rotation off. Per report from Mark Kirkwood. Also, the syslogger will automatically attempt to create the log_directory directory if it doesn't exist, but that was only happening at startup. For consistency and ease of use, it should do the same whenever the value of log_directory is changed by SIGHUP. Back-patch to all supported branches.	2012-04-27 00:13:05 -04:00
Tom Lane	17fc5db766	Fix edge-case behavior of pg_next_dst_boundary(). Due to rather sloppy thinking (on my part, I'm afraid) about the appropriate behavior for boundary conditions, pg_next_dst_boundary() gave undefined, platform-dependent results when the input time is exactly the last recorded DST transition time for the specified time zone, as a result of fetching values one past the end of its data arrays. Change its specification to be that it always finds the next DST boundary after the input time, and adjust code to match that. The sole existing caller, DetermineTimeZoneOffset, doesn't actually care about this distinction, since it always uses a probe time earlier than the instant that it does care about. So it seemed best to me to change the API to make the result=1 and result=0 cases more consistent, specifically to ensure that the "before" outputs always describe the state at the given time, rather than hacking the code to obey the previous API comment exactly. Per bug #6605 from Sergey Burladyan. Back-patch to all supported versions.	2012-04-25 17:25:34 -04:00
Andrew Dunstan	062f113e52	Revert recent commit re positional arguments.	2012-04-18 10:56:16 -04:00
Robert Haas	cd3d613ffe	Fix copyfuncs/equalfuncs support for ReassignOwnedStmt. Noah Misch	2012-04-18 10:46:52 -04:00
Andrew Dunstan	7bebb85192	Don't override arguments set via options with positional arguments. A number of utility programs were rather careless about paremeters that can be set via both an option argument and a positional argument. This leads to results which can violate the Principal Of Least Astonishment. These changes refuse to use positional arguments to override settings that have been made via positional arguments. The changes are backpatched to all live branches.	2012-04-17 18:36:22 -04:00
Tom Lane	67a48385b5	Clamp indexscan filter condition cost estimate to be not less than zero. cost_index tries to estimate the per-tuple costs of evaluating filter conditions (a/k/a qpquals) by subtracting the estimated cost of the indexqual conditions from that of the baserestrictinfo conditions. This is correct so long as the indexquals list is a subset of the baserestrictinfo list. However, in the presence of derived indexable conditions it's completely wrong, leading to bogus or even negative scan cost estimates, as seen for example in bug #6579 from Istvan Endredy. In practice the problem isn't severe except in the specific case of a LIKE optimization on a functional index containing a very expensive function. A proper fix for this might change cost estimates by more than people would like for stable branches, so in the back branches let's just clamp the cost difference to be not less than zero. That will at least prevent completely insane behavior, while not changing the results normally.	2012-04-11 20:24:45 -04:00
Tom Lane	454c7fb3a3	Fix an Assert that turns out to be reachable after all. estimate_num_groups() gets unhappy with create table empty(); select * from empty except select * from empty e2; I can't see any actual use-case for such a query (and the table is illegal per SQL spec), but it seems like a good idea that it not cause an assert failure.	2012-04-09 11:59:17 -04:00
Heikki Linnakangas	2f8659b061	set_stack_base() no longer needs to be called in PostgresMain. This was a thinko in previous commit. Now that stack base pointer is now set in PostmasterMain and SubPostmasterMain, it doesn't need to be set in PostgresMain anymore.	2012-04-08 19:42:13 +03:00
Heikki Linnakangas	ddeac5dec1	Do stack-depth checking in all postmaster children. We used to only initialize the stack base pointer when starting up a regular backend, not in other processes. In particular, autovacuum workers can run arbitrary user code, and without stack-depth checking, infinite recursion in e.g an index expression will bring down the whole cluster. The comment about PL/Java using set_stack_base() is not yet true. As the code stands, PL/java still modifies the stack_base_ptr variable directly. However, it's been discussed in the PL/Java mailing list that it should be changed to use the function, because PL/Java is currently oblivious to the register stack used on Itanium. There's another issues with PL/Java, namely that the stack base pointer it sets is not really the base of the stack, it could be something close to the bottom of the stack. That's a separate issue that might need some further changes to this code, but that's a different story. Backpatch to all supported releases.	2012-04-08 19:09:37 +03:00
Tom Lane	1cba1142b3	Update URL for pgtclng project. Thom Brown	2012-04-06 19:00:34 -04:00
Tom Lane	e98fc8c426	Fix syslogger to not lose log coherency under high load. The original coding of the syslogger had an arbitrary limit of 20 large messages concurrently in progress, after which it would just punt and dump message fragments to the output file separately. Our ambitions are a bit higher than that now, so allow the data structure to expand as necessary. Reported and patched by Andrew Dunstan; some editing by Tom	2012-04-04 15:05:35 -04:00
Tom Lane	11efdb06ee	Fix a couple of contrib/dblink bugs. dblink_exec leaked temporary database connections if any error occurred after connection setup, for example SELECT dblink_exec('...connect string...', 'select 1/0'); Add a PG_TRY block to ensure PQfinish gets done when it is needed. (dblink_record_internal is on the hairy edge of needing similar treatment, but seems not to be actively broken at the moment.) Also, in 9.0 and up, only one of the three functions using tuplestore return mode was properly checking that the query context would allow a tuplestore result. Noted while reviewing dblink patch. Back-patch to all supported branches.	2012-04-03 20:43:35 -04:00
Tom Lane	b1be1294cc	Fix O(N^2) behavior in pg_dump when many objects are in dependency loops. Combining the loop workspace with the record of already-processed objects might have been a cute trick, but it behaves horridly if there are many dependency loops to repair: the time spent in the first step of findLoop() grows as O(N^2). Instead use a separate flag array indexed by dump ID, which we can check in constant time. The length of the workspace array is now never more than the actual length of a dependency chain, which should be reasonably short in all cases of practical interest. The code is noticeably easier to understand this way, too. Per gripe from Mike Roest. Since this is a longstanding performance bug, backpatch to all supported versions.	2012-03-31 15:51:27 -04:00
Tom Lane	55eb256789	Fix O(N^2) behavior in pg_dump for large numbers of owned sequences. The loop that matched owned sequences to their owning tables required time proportional to number of owned sequences times number of tables; although this work was only expended in selective-dump situations, which is probably why the issue wasn't recognized long since. Refactor slightly so that we can perform this work after the index array for findTableByOid has been set up, reducing the time to O(M log N). Per gripe from Mike Roest. Since this is a longstanding performance bug, backpatch to all supported versions.	2012-03-31 14:42:38 -04:00
Tom Lane	38350a4965	Fix GET DIAGNOSTICS for case of assignment to function's first variable. An incorrect and entirely unnecessary "safety check" in exec_stmt_getdiag() caused the code to treat an assignment to a variable with dno zero as a no-op. Unfortunately, that's a perfectly valid dno. This has been broken since GET DIAGNOSTICS was invented. It's not terribly surprising that the bug went unnoticed for so long, since in most cases you probably wouldn't use the function's first-created variable (normally its first parameter) as a GET DIAGNOSTICS target. Nonetheless, it's broken. Per bug #6551 from Adam Buraczewski.	2012-03-22 14:14:20 -04:00
Robert Haas	596e16320e	Don't allow CREATE TABLE AS to put relations in pg_global. This was never intended to be allowed, and is blocked for an ordinary CREATE TABLE, but CREATE TABLE AS slipped through the cracks. This commit won't do anything to fix existing cases where this has loophole has been exploited, but it still seems prudent to lock it down going forward. Back-branch commit only, as this problem has been refactored away on the master branch. Andres Freund	2012-03-21 12:39:24 -04:00
Alvaro Herrera	11d7d11e53	Update struct Trigger in docs	2012-03-20 13:15:01 -03:00
Andrew Dunstan	206f5b08ec	Honor inputdir and outputdir when converting regression files. When converting source files, pg_regress' inputdir and outputdir options were ignored when computing the locations of the destination files. In consequence, these options were effectively unusable when the regression inputs need to be adjusted by pg_regress. This patch makes pg_regress put the converted files in the same place that these options specify non-converted input or results files are to be found. Backpatched to all live branches.	2012-03-17 17:24:14 -04:00
Tatsuo Ishii	77397621d8	Add description for --no-locale and --text-search-config.	2012-03-11 20:12:42 +09:00
Tom Lane	3713ca86fb	Improve documentation around logging_collector and use of stderr. In backup.sgml, point out that you need to be using the logging collector if you want to log messages from a failing archive_command script. (This is an oversimplification, in that it will work without the collector as long as you're not sending postmaster stderr to /dev/null; but it seems like a good idea to encourage use of the collector to avoid problems with multiple processes concurrently scribbling on one file.) In config.sgml, do some wordsmithing of logging_collector discussion. Per bug #6518 from Janning Vygen	2012-03-05 14:09:10 -05:00
Tom Lane	82345d87c7	Stamp 8.3.18.	2012-02-23 18:01:58 -05:00
Tom Lane	ecabae5af9	Last-minute release note updates. Security: CVE-2012-0866, CVE-2012-0867, CVE-2012-0868	2012-02-23 17:48:18 -05:00
Tom Lane	a7f6cb8548	Convert newlines to spaces in names written in pg_dump comments. pg_dump was incautious about sanitizing object names that are emitted within SQL comments in its output script. A name containing a newline would at least render the script syntactically incorrect. Maliciously crafted object names could present a SQL injection risk when the script is reloaded. Reported by Heikki Linnakangas, patch by Robert Haas Security: CVE-2012-0868	2012-02-23 15:53:34 -05:00
Tom Lane	d1b8b8fbea	Require execute permission on the trigger function for CREATE TRIGGER. This check was overlooked when we added function execute permissions to the system years ago. For an ordinary trigger function it's not a big deal, since trigger functions execute with the permissions of the table owner, so they couldn't do anything the user issuing the CREATE TRIGGER couldn't have done anyway. However, if a trigger function is SECURITY DEFINER, that is not the case. The lack of checking would allow another user to install it on his own table and then invoke it with, essentially, forged input data; which the trigger function is unlikely to realize, so it might do something undesirable, for instance insert false entries in an audit log table. Reported by Dinesh Kumar, patch by Robert Haas Security: CVE-2012-0866	2012-02-23 15:39:20 -05:00
Peter Eisentraut	a930226c61	Translation updates	2012-02-23 20:28:42 +02:00
Tom Lane	c06598ce18	Draft release notes for 9.1.3, 9.0.7, 8.4.11, 8.3.18.	2012-02-22 18:12:05 -05:00
Tom Lane	2c293f2549	Don't clear btpo_cycleid during _bt_vacuum_one_page. When "vacuuming" a single btree page by removing LP_DEAD tuples, we are not actually within a vacuum operation, but rather in an ordinary insertion process that could well be running concurrently with a vacuum. So clearing the cycleid is incorrect, and could cause the concurrent vacuum to miss removing tuples that it needs to remove. This is a longstanding bug introduced by commit `e6284649b9` of 2006-07-25. I believe it explains Maxim Boguk's recent report of index corruption, and probably some other previously unexplained reports. In 9.0 and up this is a one-line fix; before that we need to introduce a flag to tell _bt_delitems what to do.	2012-02-21 15:04:01 -05:00
Magnus Hagander	f3ad4ca00e	Avoid double close of file handle in syslogger on win32 This causes an exception when running under a debugger or in particular when running on a debug version of Windows. Patch from MauMau	2012-02-21 17:14:36 +01:00
Tom Lane	246b85a948	Don't reject threaded Python on FreeBSD. According to Chris Rees, this has worked for awhile, and the current FreeBSD port is removing the test anyway.	2012-02-20 16:21:52 -05:00
Tom Lane	35ab4284b4	Fix regex back-references that are directly quantified with . The syntax "\n", that is a backref with a * quantifier directly applied to it, has never worked correctly in Spencer's library. This has been an open bug in the Tcl bug tracker since 2005: https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894 The core of the problem is in parseqatom(), which first changes "\n" to "\n+\|" and then applies repeat() to the NFA representing the backref atom. repeat() thinks that any arc leading into its "rp" argument is part of the sub-NFA to be repeated. Unfortunately, since parseqatom() already created the arc that was intended to represent the empty bypass around "\n+", this arc gets moved too, so that it now leads into the state loop created by repeat(). Thus, what was supposed to be an "empty" bypass gets turned into something that represents zero or more repetitions of the NFA representing the backref atom. In the original example, in place of ^([bc])\1$ we now have something that acts like ^([bc])(\1+\|[bc])$ At runtime, the branch involving the actual backref fails, as it's supposed to, but then the other branch succeeds anyway. We could no doubt fix this by some rearrangement of the operations in parseqatom(), but that code is plenty ugly already, and what's more the whole business of converting "x" to "x+\|" probably needs to go away to fix another problem I'll mention in a moment. Instead, this patch suppresses the *-conversion when the target is a simple backref atom, leaving the case of m == 0 to be handled at runtime. This makes the patch in regcomp.c a one-liner, at the cost of having to tweak cbrdissect() a little. In the event I went a bit further than that and rewrote cbrdissect() to check all the string-length-related conditions before it starts comparing characters. It seems a bit stupid to possibly iterate through many copies of an n-character backreference, only to fail at the end because the target string's length isn't a multiple of n --- we could have found that out before starting. The existing coding could only be a win if integer division is hugely expensive compared to character comparison, but I don't know of any modern machine where that might be true. This does not fix all the problems with quantified back-references. In particular, the code is still broken for back-references that appear within a larger expression that is quantified (so that direct insertion of the quantification limits into the BACKREF node doesn't apply). I think fixing that will take some major surgery on the NFA code, specifically introducing an explicit iteration node type instead of trying to transform iteration into concatenation of modified regexps. Back-patch to all supported branches. In HEAD, also add a regression test case for this. (It may seem a bit silly to create a regression test file for just one test case; but I'm expecting that we will soon import a whole bunch of regex regression tests from Tcl, so might as well create the infrastructure now.)	2012-02-20 00:52:59 -05:00
Tom Lane	b0e1a4bd5e	Fix longstanding error in contrib/intarray's int[] & int[] operator. The array intersection code would give wrong results if the first entry of the correct output array would be "1". (I think only this value could be at risk, since the previous word would always be a lower-bound entry with that fixed value.) Problem spotted by Julien Rouhaud, initial patch by Guillaume Lelarge, cosmetic improvements by me.	2012-02-16 20:00:34 -05:00
Tom Lane	3eb2ff16db	Fix I/O-conversion-related memory leaks in plpgsql. Datatype I/O functions are allowed to leak memory in CurrentMemoryContext, since they are generally called in short-lived contexts. However, plpgsql calls such functions for purposes of type conversion, and was calling them in its procedure context. Therefore, any leaked memory would not be recovered until the end of the plpgsql function. If such a conversion was done within a loop, quite a bit of memory could get consumed. Fix by calling such functions in the transient "eval_econtext", and adjust other logic to match. Back-patch to all supported versions. Andres Freund, Jan Urbański, Tom Lane	2012-02-11 18:06:46 -05:00
Tom Lane	01f99e2d20	Fix brain fade in previous pg_dump patch. In pre-7.3 databases, pg_attribute.attislocal doesn't exist. The easiest way to make sure the new inheritance logic behaves sanely is to assume it's TRUE, not FALSE. This will result in printing child columns even when they're not really needed. We could work harder at trying to reconstruct a value for attislocal, but there is little evidence that anyone still cares about dumping from such old versions, so just do the minimum necessary to have a valid dump. I had this correct in the original draft of the patch, but for some unaccountable reason decided it wasn't necessary to change the value. Testing against an old server shows otherwise...	2012-02-10 14:09:42 -05:00
Tom Lane	02e6418181	Fix pg_dump for better handling of inherited columns. Revise pg_dump's handling of inherited columns, which was last looked at seriously in 2001, to eliminate several misbehaviors associated with inherited default expressions and NOT NULL flags. In particular make sure that a column is printed in a child table's CREATE TABLE command if and only if it has attislocal = true; the former behavior would sometimes cause a column to become marked attislocal when it was not so marked in the source database. Also, stop relying on textual comparison of default expressions to decide if they're inherited; instead, don't use default-expression inheritance at all, but just install the default explicitly at each level of the hierarchy. This fixes the search-path-related misbehavior recently exhibited by Chester Young, and also removes some dubious assumptions about the order in which ALTER TABLE SET DEFAULT commands would be executed. Back-patch to all supported branches.	2012-02-10 13:28:31 -05:00
Tom Lane	0f5d208253	Avoid problems with OID wraparound during WAL replay. Fix a longstanding thinko in replay of NEXTOID and checkpoint records: we tried to advance nextOid only if it was behind the value in the WAL record, but the comparison would draw the wrong conclusion if OID wraparound had occurred since the previous value. Better to just unconditionally assign the new value, since OID assignment shouldn't be happening during replay anyway. The consequences of a failure to update nextOid would be pretty minimal, since we have long had the code set up to obtain another OID and try again if the generated value is already in use. But in the worst case there could be significant performance glitches while such loops iterate through many already-used OIDs before finding a free one. The odds of a wraparound happening during WAL replay would be small in a crash-recovery scenario, and the length of any ensuing OID-assignment stall quite limited anyway. But neither of these statements hold true for a replication slave that follows a WAL stream for a long period; its behavior upon going live could be almost unboundedly bad. Hence it seems worth back-patching this fix into all supported branches. Already fixed in HEAD in commit `c6d76d7c82`.	2012-02-06 13:15:04 -05:00

1 2 3 4 5 ...

27144 Commits All Branches Search

27144 Commits

All Branches