postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	b09cb0cf12	Remove the pgstat_drop_relation() call from smgr_internal_unlink(), because we don't know at that point which relation OID to tell pgstat to forget. The code was passing the relfilenode, which is incorrect, and could possibly cause some other relation's stats to be zeroed out. While we could try to clean this up, it seems much simpler and more reliable to let the next invocation of pgstat_vacuum_tabstat() fix things; which indeed is how it worked before I introduced the buggy code into 8.1.3 and later :-(. Problem noticed by Itagaki Takahiro, fix is per subsequent discussion.	2007-07-08 22:23:16 +00:00
Tom Lane	48d9d8e131	Fix a couple of planner bugs introduced by the new ability to discard ORDER BY <constant> as redundant. One is that this means query_planner() has to canonicalize pathkeys even when the query jointree is empty; the canonicalization was always a no-op in such cases before, but no more. Also, we have to guard against thinking that a set-returning function is "constant" for this purpose. Add a couple of regression tests for these evidently under-tested cases. Per report from Greg Stark and subsequent experimentation.	2007-07-07 20:46:45 +00:00
Tom Lane	7af3a6fc6f	Fix up hash functions for datetime datatypes so that they don't take unwarranted liberties with int8 vs float8 values for these types. Specifically, be sure to apply either hashint8 or hashfloat8 depending on HAVE_INT64_TIMESTAMP. Per my gripe of even date.	2007-07-06 04:16:00 +00:00
Tom Lane	83aaebba63	Fix incorrect comment about the timing of AbsorbFsyncRequests() during checkpoint. The comment claimed that we could do this anytime after setting the checkpoint REDO point, but actually BufferSync is relying on the assumption that buffers dumped by other backends will be fsync'd too. So we really could not do it any sooner than we are doing it.	2007-07-03 14:51:24 +00:00
Neil Conway	a55898131e	Add ALTER VIEW ... RENAME TO, and a RENAME TO clause to ALTER SEQUENCE. Sequences and views could previously be renamed using ALTER TABLE, but this was a repeated source of confusion for users. Update the docs, and psql tab completion. Patch from David Fetter; various minor fixes by myself.	2007-07-03 01:30:37 +00:00
Tom Lane	1c7fe33fdb	Fix failure to restart Postgres when Linux kernel returns EIDRM for shmctl(). This is a Linux kernel bug that apparently exists in every extant kernel version: sometimes shmctl() will fail with EIDRM when EINVAL is correct. We were assuming that EIDRM indicates a possible conflict with pre-existing backends, and refusing to start the postmaster when this happens. Fortunately, there does not seem to be any case where Linux can legitimately return EIDRM (it doesn't track shmem segments in a way that would allow that), so we can get away with just assuming that EIDRM means EINVAL on this platform. Per reports from Michael Fuhr and Jon Lapham --- it's a bit surprising we have not seen more reports, actually.	2007-07-02 20:11:55 +00:00
Tom Lane	bce7bacdf2	Reduce the maximum sleep interval in the autovac launcher to 1 second, so that it responds to SIGQUIT reasonably promptly even on machines where SA_RESTART signals restart a sleep from scratch. (This whole area could stand some rethinking, but for now make it work like the other processes do.) Also some marginal stylistic cleanups.	2007-07-01 18:30:54 +00:00
Tom Lane	421d50273f	Treat the autovac launcher more like a regular backend, in that we wait for it to die before telling the bgwriter to initiate shutdown checkpoint. Since it's connected to shared memory, this seems more prudent than the alternative of letting it quit asynchronously. Resolves my complaint of yesterday about repeated shutdown checkpoints in CVS HEAD.	2007-07-01 18:28:41 +00:00
Tom Lane	8f55b9a8ba	Avoid memory leakage when a series of subtransactions invoke AFTER triggers that are fired at end-of-statement (as is the normal case for foreign keys, for example). In this situation the per-subxact deferred trigger context is always empty when subtransaction exit is reached; so we could free it, but were not doing so, leading to an intratransaction leak of 8K or more per subtransaction. Per off-list example from Viatcheslav Kalinin subsequent to bug #3418 (his original bug report omitted a foreign key constraint needed to cause this leak). Back-patch to 8.2; prior versions were not using per-subxact contexts for deferred triggers, so did not have this leak.	2007-07-01 17:45:42 +00:00
Tom Lane	beba73763b	Fix comments not updated in recent patch.	2007-07-01 02:22:23 +00:00
Tom Lane	070907b241	Add 'volatile' to suppress 'variable might be clobbered by longjmp' warning emitted by some versions of gcc.	2007-07-01 02:20:59 +00:00
Tom Lane	9fc25c0511	Improve logging of checkpoints. Patch by Greg Smith, worked over by Heikki and a little bit by me.	2007-06-30 19:12:02 +00:00
Alvaro Herrera	2910ccefb4	Avoid crash in interrupted autovacuum worker, caused by leaving the current memory context pointing at a context not long lived enough. Also, create a fake PortalContext in which to store the vac_context, if only to avoid having it be a top-level memory context.	2007-06-30 04:08:05 +00:00
Alvaro Herrera	10af02b912	Arrange for SIGINT in autovacuum workers to cancel the current table and continue with the schedule. Change current uses of SIGINT to abort a worker into SIGTERM, which keeps the old behaviour of terminating the process. Patch from ITAGAKI Takahiro, with some editorializing of my own.	2007-06-29 17:07:39 +00:00
Tom Lane	6faf795662	Fix a passel of ancient bugs in to_char(), including two distinct buffer overruns (neither of which seem likely to be exploitable as security holes, fortunately, since the provoker can't control the data written). One of these is due to choosing to stomp on the output of a called function, which is bad news in any case; make it treat the called functions' results as read-only. Avoid some unnecessary palloc/pfree traffic too; it's not really helpful to free small temporary objects, and again this is presuming more than it ought to about the nature of the results of called functions. Per report from Patrick Welche and additional code-reading by Imad.	2007-06-29 01:51:35 +00:00
Tom Lane	867e2c91a0	Implement "distributed" checkpoints in which the checkpoint I/O is spread over a fairly long period of time, rather than being spat out in a burst. This happens only for background checkpoints carried out by the bgwriter; other cases, such as a shutdown checkpoint, are still done at full speed. Remove the "all buffers" scan in the bgwriter, and associated stats infrastructure, since this seems no longer very useful when the checkpoint itself is properly throttled. Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas, and some minor API editorialization by me.	2007-06-28 00:02:40 +00:00
Alvaro Herrera	80f3b5ad2e	Remove unused "caller" argument from stringToQualifiedNameList.	2007-06-26 16:48:09 +00:00
Alvaro Herrera	bae0b56880	Improve autovacuum launcher's ability to detect a problem in worker startup, by having the postmaster signal it when certain failures occur. This requires the postmaster setting a flag in shared memory, but should be as safe as the pmsignal.c code is. Also make sure the launcher honors a postgresql.conf change turning it off on SIGHUP.	2007-06-25 16:09:03 +00:00
Tom Lane	46379d6e60	Separate parse-analysis for utility commands out of parser/analyze.c (which now deals only in optimizable statements), and put that code into a new file parser/parse_utilcmd.c. This helps clarify and enforce the design rule that utility statements shouldn't be processed during the regular parse analysis phase; all interpretation of their meaning should happen after they are given to ProcessUtility to execute. (We need this because we don't retain any locks for a utility statement that's in a plan cache, nor have any way to detect that it's stale.) We are also able to simplify the API for parse_analyze() and related routines, because they will now always return exactly one Query structure. In passing, fix bug #3403 concerning trying to add a serial column to an existing temp table (this is largely Heikki's work, but we needed all that restructuring to make it safe).	2007-06-23 22:12:52 +00:00
Tom Lane	ba826299e0	Allow trailing whitespace in parse_real(), for consistency with parse_int() and with itself (strtod allows leading whitespace, so it seems odd not to allow trailing whitespace). parse_bool remains not-whitespace-friendly, but this is generically true for non-numeric GUC variables, so I'll desist from changing it.	2007-06-21 22:59:12 +00:00
Tom Lane	aa55d05571	Provide a HINT listing the allowed unit names when a GUC variable seems to contain a wrong unit specification, per discussion. In passing, fix the code to avoid unnecessary integer overflows when converting units, and to detect overflows when they do occur.	2007-06-21 18:14:21 +00:00
Tom Lane	6f0072df77	Restrict deadlock_timeout to the range for which the implementation actually works sanely, viz not 0 and not more than INT_MAX/1000 (else TimestampTzPlusMilliseconds can overflow). Per discussion with Greg Stark. Since this is a superuser-only setting and there was not previously any big reason to change it, not worth back-patching.	2007-06-20 18:31:39 +00:00
Tom Lane	cd407354ee	transformColumnDefinition failed to complain about create table foo (bar int default null default 3); due to not thinking about the special-case handling of DEFAULT NULL. Problem noticed while investigating bug #3396.	2007-06-20 18:21:00 +00:00
Tom Lane	a060d5ffdc	CREATE DOMAIN ... DEFAULT NULL failed because gram.y special-cases DEFAULT NULL and DefineDomain didn't. Bug goes all the way back to original coding of domains. Per bug #3396 from Sergey Burladyan.	2007-06-20 18:15:49 +00:00
Neil Conway	c1d89c61fc	Minor code cleanup: calling FreeFile() before ereport(ERROR) is not necessary, since files opened via AllocateFile() are closed automatically as part of error recovery.	2007-06-20 02:02:49 +00:00
Tom Lane	9cce91dba0	Only log 'process acquired lock' if we actually did get the lock. This test seems inessential right now since the only control path for not getting the lock is via CHECK_FOR_INTERRUPTS which won't return control to ProcSleep, but it would be important if we ever allow the deadlock code to kill someone else's transaction instead of our own.	2007-06-19 22:01:15 +00:00
Neil Conway	ec4595dae1	Remove duplicate #include.	2007-06-19 21:24:48 +00:00
Tom Lane	6e07228728	Code review for log_lock_waits patch. Don't try to issue log messages from within a signal handler (this might be safe given the relatively narrow code range in which the interrupt is enabled, but it seems awfully risky); do issue more informative log messages that tell what is being waited for and the exact length of the wait; minor other code cleanup. Greg Stark and Tom Lane	2007-06-19 20:13:22 +00:00
Tom Lane	4c310eca2e	Arrange for quote_identifier() and pg_dump to not quote keywords that are unreserved according to the grammar. The list of unreserved words has gotten extensive enough that the unnecessary quoting is becoming a bit of an eyesore. To do this, add knowledge of the keyword category to keywords.c's table. (Someday we might be able to generate keywords.c's table and the keyword lists in gram.y from a common source.) For the moment, lie about WITH's status in the table so it will still get quoted --- this is because of the expectation that WITH will become reserved when the SQL recursive-queries patch gets done. I didn't force initdb because this affects nothing on-disk; but note that a few regression tests have changed expected output.	2007-06-18 21:40:58 +00:00
Magnus Hagander	532834081d	Remove comment about modifying tab-complete.c for userset GUC. Simon Riggs	2007-06-18 10:02:57 +00:00
Tom Lane	de6a6383a7	Update obsolete comment: it's no longer the case that mdread() will allow reads beyond EOF, except by special coercion.	2007-06-18 00:47:20 +00:00
Tom Lane	011b51cb7e	Marginal hacking to improve the speed of COPY OUT. I had found in a bit of profiling that CopyAttributeOutText was taking an unreasonable fraction of the backend run time (like 66%!) on the following trivial test case: $ time psql -c "copy (select repeat('xyzzy',50) from generate_series(1,10000000)) to stdout" regression >/dev/null The time is all being spent on scanning the string for characters to be escaped, which most of the time there aren't any of. Some tweaking to take as many tests as possible out of the inner loop reduced the runtime of this example by more than 10%. In a real-world case it wouldn't be as useful a speedup, but it still seems worth adding a few lines here.	2007-06-17 23:39:28 +00:00
Tom Lane	6775c01080	Revert an ill-considered portion of my patch of 12-Mar, which tried to save a few lines in sql_exec_error_callback() by using the function source string field that the patch added to SQL function cache entries. This doesn't work because the fn_extra field isn't filled in yet during init_sql_fcache(). Probably it could be made to work, but it doesn't seem appropriate to contort the main code paths to make an error-reporting path a tad faster. Per report from Pavel Stehule.	2007-06-17 18:57:29 +00:00
Tom Lane	23347231a5	Tweak the API for per-datatype typmodin functions so that they are passed an array of strings rather than an array of integers, and allow any simple constant or identifier to be used in typmods; for example create table foo (f1 widget(42,'23skidoo',point)); Of course the typmodin function has still got to pack this info into a non-negative int32 for storage, but it's still a useful improvement in flexibility, especially considering that you can do nearly anything if you are willing to keep the info in a side table. We can get away with this change since we have not yet released a version providing user-definable typmods. Per discussion.	2007-06-15 20:56:52 +00:00
Alvaro Herrera	bd06ab29ae	Avoid having autovacuum run multiple ANALYZE commands in a single transaction, to prevent possible deadlock problems. Per request from Tom Lane.	2007-06-14 13:53:14 +00:00
Andrew Dunstan	bd2cb9aaa5	Implement a chunking protocol for writes to the syslogger pipe, with messages reassembled in the syslogger before writing to the log file. This prevents partial messages from being written, which mucks up log rotation, and messages from different backends being interleaved, which causes garbled logs. Backport as far as 8.0, where the syslogger was introduced. Tom Lane and Andrew Dunstan	2007-06-14 01:48:51 +00:00
Alvaro Herrera	a0a26c47d4	Avoid integer overflow issues in autovacuum.	2007-06-13 21:24:56 +00:00
Tom Lane	e976fd43c6	Add some simple defenses against null fields in pg_largeobject, and add comments noting that there's an alignment assumption now that the data field could be in 1-byte-header format. Per discussion with Greg Stark.	2007-06-12 19:46:24 +00:00
Tom Lane	152133bfaf	Add some comments about the safety of accessing rolpassword without using the normal heap_getattr() machinery. Per Greg Stark.	2007-06-12 17:16:52 +00:00
Tom Lane	d0599994da	Fix DecodeDateTime to allow timezone to appear before year. This had historically worked in some but not all cases, but as of 8.2 it failed for all timezone formats. Fix, and add regression test cases to catch future regressions in this area. Per gripe from Adam Witney.	2007-06-12 15:58:32 +00:00
Tom Lane	a9545b3aef	Improve UPDATE/DELETE WHERE CURRENT OF so that they can be used from plpgsql with a plpgsql-defined cursor. The underlying mechanism for this is that the main SQL engine will now take "WHERE CURRENT OF $n" where $n is a refcursor parameter. Not sure if we should document that fact or consider it an implementation detail. Per discussion with Pavel Stehule.	2007-06-11 22:22:42 +00:00
Tom Lane	6808f1b1de	Support UPDATE/DELETE WHERE CURRENT OF cursor_name, per SQL standard. Along the way, allow FOR UPDATE in non-WITH-HOLD cursors; there may once have been a reason to disallow that, but it seems to work now, and it's really rather necessary if you want to select a row via a cursor and then update it in a concurrent-safe fashion. Original patch by Arul Shaji, rather heavily editorialized by Tom Lane.	2007-06-11 01:16:30 +00:00
Tom Lane	85d72f0516	Teach heapam code to know the difference between a real seqscan and the pseudo HeapScanDesc created for a bitmap heap scan. This avoids some useless overhead during a bitmap scan startup, in particular invoking the syncscan code. (We might someday want to do that, but right now it's merely useless contention for shared memory, to say nothing of possibly pushing useful entries out of syncscan's small LRU list.) This also allows elimination of ugly pgstat_discount_heap_scan() kluge.	2007-06-09 18:49:55 +00:00
Tom Lane	e17e40f783	Allow numeric_fac() to be interrupted, since it can take quite a while for large inputs. Also cause it to error out immediately if the result will overflow, instead of grinding through a lot of calculation first. Per gripe from Jim Nasby.	2007-06-09 15:52:30 +00:00
Alvaro Herrera	a4d5872719	Disallow the cost balancing code from resulting in a zero cost limit, which causes a division-by-zero error in the vacuum code. This can happen when there are more workers than cost limit units. Per report from Galy Lee in <200705310914.l4V9E6JA094603@wwwmaster.postgresql.org>.	2007-06-08 21:21:28 +00:00
Alvaro Herrera	2b438c12cc	Avoid passing zero as a value for vacuum_cost_limit, because it's not a valid value for the vacuum code. Instead, make zero signify getting the value from a higher level configuration facility, just like -1 in the original coding. We still document that -1 is the value that disables the feature, to avoid confusing the user unnecessarily. Reported by Galy Lee in <200705310914.l4V9E6JA094603@wwwmaster.postgresql.org>; per subsequent discussion.	2007-06-08 21:09:49 +00:00
Tom Lane	a04a423599	Arrange for large sequential scans to synchronize with each other, so that when multiple backends are scanning the same relation concurrently, each page is (ideally) read only once. Jeff Davis, with review by Heikki and Tom.	2007-06-08 18:23:53 +00:00
Tom Lane	6d6d14b6d5	Redefine IsTransactionState() to only return true for TRANS_INPROGRESS state, which is the only state in which it's safe to initiate database queries. It turns out that all but two of the callers thought that's what it meant; and the other two were using it as a proxy for "will GetTopTransactionId() return a nonzero XID"? Since it was in fact an unreliable guide to that, make those two just invoke GetTopTransactionId() always, then deal with a zero result if they get one.	2007-06-07 21:45:59 +00:00
Tom Lane	24ee8af573	Rework temp_tablespaces patch so that temp tablespaces are assigned separately for each temp file, rather than once per sort or hashjoin; this allows spreading the data of a large sort or join across multiple tablespaces. (I remain dubious that this will make any difference in practice, but certain people insisted.) Arrange to cache the results of parsing the GUC variable instead of recomputing from scratch on every demand, and push usage of the cache down to the bottommost fd.c level.	2007-06-07 19:19:57 +00:00
Alvaro Herrera	2d9d7a6bf5	Avoid losing track of data for shared tables in pgstats. Report by Michael Fuhr, patch from Tom Lane after a messier suggestion by me.	2007-06-07 18:53:17 +00:00
Tom Lane	2d4db3675f	Fix up text concatenation so that it accepts all the reasonable cases that were accepted by prior Postgres releases. This takes care of the loose end left by the preceding patch to downgrade implicit casts-to-text. To avoid breaking desirable behavior for array concatenation, introduce a new polymorphic pseudo-type "anynonarray" --- the added concatenation operators are actually text || anynonarray and anynonarray || text.	2007-06-06 23:00:50 +00:00
Tom Lane	7dab4f75ca	Minor editorialization: don't flush plan cache without need.	2007-06-05 21:50:19 +00:00
Tom Lane	31edbadf4a	Downgrade implicit casts to text to be assignment-only, except for the ones from the other string-category types; this eliminates a lot of surprising interpretations that the parser could formerly make when there was no directly applicable operator. Create a general mechanism that supports casts to and from the standard string types (text,varchar,bpchar) for every datatype, by invoking the datatype's I/O functions. These new casts are assignment-only in the to-string direction, explicit-only in the other, and therefore should create no surprising behavior. Remove a bunch of thereby-obsoleted datatype-specific casting functions. The "general mechanism" is a new expression node type CoerceViaIO that can actually convert between any two datatypes if their external text representations are compatible. This is more general than needed for the immediate feature, but might be useful in plpgsql or other places in future. This commit does nothing about the issue that applying the concatenation operator || to non-text types will now fail, often with strange error messages due to misinterpreting the operator as array concatenation. Since it often (not always) worked before, we should either make it succeed or at least give a more user-friendly error; but details are still under debate. Peter Eisentraut and Tom Lane	2007-06-05 21:31:09 +00:00
Jan Wieck	1120b99445	The session_replication_role actually can be changed at will during a session regardless of the existence of cached plans. The plancache only needs to be invalidated so that rules affected by the new setting will be reflected in the new query plans. Jan	2007-06-05 20:00:41 +00:00
Teodor Sigaev	f74426283d	Move call of MarkBufferDirty() before XLogInsert() as required. Many thanks to Heikki Linnakangas <heikki@enterprisedb.com> for his sharp eyes.	2007-06-05 12:47:49 +00:00
Andrew Dunstan	4c0fe51279	Remove ill-conceived CRLF translation for Windows in syslogger.	2007-06-04 22:21:42 +00:00
Teodor Sigaev	853d1c3103	Fix a bundle of GIN bugs: - Fix possible deadlock between UPDATE and VACUUM queries. The bug was never observed in 8.2, but it still exists there. HEAD is more sensitive to the bug after the recent buffer "ring" improvements. - Fix WAL creation: if the parent page is stored as is after a split, then the incomplete split isn't removed during replay. This happens rather rarely, only on large tables with a lot of updates/inserts. - Fix WAL replay: there was a wrong test of XLR_BKP_BLOCK_* for the left page after deletion of a page. That caused a wrong rightlink field: it pointed to the deleted page. - Add checking of match when clearing an incomplete split. - Clean up the incomplete split list after proceeding. None of these changes alters on-disk storage, so backpatch... But the second point may be an issue for replaying logs from a previous version.	2007-06-04 15:56:28 +00:00
Magnus Hagander	aae5403278	On win32, retry reading when WSARecv returns WSAEWOULDBLOCK. There seem to be cases when at least Windows 2000 can do this even though select just indicated that the socket is readable. Per report and analysis from Cyril VELTER.	2007-06-04 13:39:28 +00:00
Magnus Hagander	0e92f9813e	On win32, don't use SO_REUSEADDR for TCP sockets. Per failure on buildfarm member baiji and subsequent discussion.	2007-06-04 11:59:20 +00:00
Peter Eisentraut	f4a3789b39	Clarify some error messages about duplicate things.	2007-06-03 22:16:03 +00:00
Tom Lane	acfce502ba	Create a GUC parameter temp_tablespaces that allows selection of the tablespace(s) in which to store temp tables and temporary files. This is a list to allow spreading the load across multiple tablespaces (a random list element is chosen each time a temp object is to be created). Temp files are not stored in per-database pgsql_tmp/ directories anymore, but per-tablespace directories. Jaime Casanova and Albert Cervera, with review by Bernd Helmle and Tom Lane.	2007-06-03 17:08:34 +00:00
Peter Eisentraut	5d429f8d88	Minimal message corrections found by spell checker.	2007-06-02 23:36:35 +00:00
Tom Lane	376ee15033	Fix erroneous error reporting for overlength input in text_date(), text_time(), and text_timetz(). 7.4-vintage bug found by Greg Stark.	2007-06-02 16:41:09 +00:00
Andrew Dunstan	15f8202c20	Improve efficiency of LIKE/ILIKE code, especially for multi-byte charsets, and most especially for UTF8. Remove unnecessary special cases for bytea processing and single-byte charset ILIKE. a ILIKE b is now processed as lower(a) LIKE lower(b) in all cases. The code is now considerably simpler. All comparisons are now performed byte-wise, and the text and pattern are also advanced byte-wise where it is safe to do so - essentially where a wildcard is not being matched. Andrew Dunstan, from an original patch by ITAGAKI Takahiro, with ideas from Tom Lane and Mark Mielke.	2007-06-02 02:03:42 +00:00
Tom Lane	964ec46cfe	Fix aboriginal bug in BufFileDumpBuffer that would cause it to write the wrong data when dumping a bufferload that crosses a component-file boundary. This probably has not been seen in the wild because (a) component files are normally 1GB apiece and (b) non-block-aligned buffer usage is relatively rare. But it's fairly easy to reproduce a problem if one reduces RELSEG_SIZE in a test build. Kudos to Kurt Harriman for spotting the bug.	2007-06-01 23:43:11 +00:00
Neil Conway	f086be3d39	Allow leading and trailing whitespace in the input to the boolean type. Also, add explicit casts between boolean and text/varchar. Both of these changes are for conformance with SQL:2003. Update the regression tests, bump the catversion.	2007-06-01 23:40:19 +00:00
Tom Lane	bd0a260928	Make CREATE/DROP/RENAME DATABASE wait a little bit to see if other backends will exit before failing because of conflicting DB usage. Per discussion, this seems a good idea to help mask the fact that backend exit takes nonzero time. Remove a couple of thereby-obsoleted sleeps in contrib and PL regression test sequences.	2007-06-01 19:38:07 +00:00
Tom Lane	bd2c980b22	Buy back some of the cycles spent in more-expensive hash functions by selecting power-of-2, rather than prime, numbers of buckets in hash joins. If the hash functions are doing their jobs properly by making all hash bits equally random, this is good enough, and it saves expensive integer division and modulus operations.	2007-06-01 17:38:44 +00:00
Tom Lane	1f559b7d3a	Fix several hash functions that were taking chintzy shortcuts instead of delivering a well-randomized hash value. I got religion on this after observing that performance of multi-batch hash join degrades terribly if the higher-order bits of hash values aren't random, as indeed was true for say hashes of small integer values. It's now expected and documented that hash functions should use hash_any or some comparable method to ensure that all bits of their output are about equally random. initdb forced because this change invalidates existing hash indexes. For the same reason, this isn't back-patchable; the hash join performance problem will get a band-aid fix in the back branches.	2007-06-01 15:33:19 +00:00
Tom Lane	cc3e9deee6	The shortcut exit that I recently added to ExecInitIndexScan() for EXPLAIN-only operation was a little too short; it skipped initializing the node's result tuple type, which may be needed depending on what's above the indexscan node. Call ExecAssignResultTypeFromTL before exiting. (For good luck I moved up the ExecAssignScanProjectionInfo call as well, so that everything except indexscan-specific initialization will still be done.) Per example from Grant Finnemore.	2007-05-31 20:45:26 +00:00
Tom Lane	10f719af33	Change build_index_pathkeys() so that the expressions it builds to represent index key columns always have the type expected by the index's associated operators, ie, we add RelabelType nodes when dealing with binary-compatible index opclasses. This is needed to get varchar indexes to play nicely with the new EquivalenceClass machinery, as per recent gripe from Josh Berkus that CVS HEAD was failing to match a varchar index column to a constant restriction in the query. It seems likely that this change will allow removal of a lot of ugly ad-hoc RelabelType-stripping that the planner has traditionally done while matching expressions to other expressions, but I'll worry about that some other day.	2007-05-31 16:57:34 +00:00
Peter Eisentraut	7ce9b3683e	Make some messages more consistent	2007-05-31 15:13:06 +00:00
Teodor Sigaev	54af876593	Replace ReadBuffer with ReadBufferWithStrategy in all vacuum-involved places to implement the limited-size "ring" of buffers for VACUUM in GIN & GiST	2007-05-31 14:03:09 +00:00
Peter Eisentraut	71fb7b9014	Downgrade some low-level startup messages to DEBUG1.	2007-05-31 07:36:12 +00:00
Tom Lane	fa0e318f94	Fix overly-strict sanity check in BeginInternalSubTransaction that made it fail when used in a deferred trigger. Bug goes back to 8.0; no doubt the reason it hadn't been noticed is that we've been discouraging use of user-defined constraint triggers. Per report from Frank van Vugt.	2007-05-30 21:01:39 +00:00
Tom Lane	d526575f89	Make large sequential scans and VACUUMs work in a limited-size "ring" of buffers, rather than blowing out the whole shared-buffer arena. Aside from avoiding cache spoliation, this fixes the problem that VACUUM formerly tended to cause a WAL flush for every page it modified, because we had it hacked to use only a single buffer. Those flushes will now occur only once per ring-ful. The exact ring size, and the threshold for seqscans to switch into the ring usage pattern, remain under debate; but the infrastructure seems done. The key bit of infrastructure is a new optional BufferAccessStrategy object that can be passed to ReadBuffer operations; this replaces the former StrategyHintVacuum API. This patch also changes the buffer usage-count methodology a bit: we now advance usage_count when first pinning a buffer, rather than when last unpinning it. To preserve the behavior that a buffer's lifetime starts to decrease when it's released, the clock sweep code is modified to not decrement usage_count of pinned buffers. Work not done in this commit: teach GiST and GIN indexes to use the vacuum BufferAccessStrategy for vacuum-driven fetches. Original patch by Simon, reworked by Heikki and again by Tom.	2007-05-30 20:12:03 +00:00
Neil Conway	f14f27dd38	Tweak: use memcpy() in text_time(), rather than manually copying bytes in a loop.	2007-05-30 19:38:05 +00:00
Neil Conway	6af04882de	Fix a bug in input processing for the "interval" type. Previously, "microsecond" and "millisecond" units were not considered valid input by themselves, which caused inputs like "1 millisecond" to be rejected erroneously. Update the docs, add regression tests, and backport to 8.2 and 8.1	2007-05-29 04:58:43 +00:00
Neil Conway	e78720ff2f	mmgr README tweak: "either" is no longer correct. The previous wording compared PortalContext with QueryContext, but the latter no longer exists.	2007-05-29 04:19:35 +00:00
Tom Lane	fa98a86f65	Tweak the code in a couple of places to try to deliver more user-friendly error messages when a single COPY line is too long for us to handle. Per example from Johann Spies.	2007-05-28 16:43:24 +00:00
Neil Conway	f505edace1	Code cleanup: use "bool" for Boolean variables, rather than "int".	2007-05-27 20:32:16 +00:00
Tom Lane	97d12b434f	Ooops, I was too busy worrying about getting the transactional infrastructure right to think carefully about how insert and delete counts map to n_live_tuples. Of course a deletion should reduce n_live_tuples.	2007-05-27 17:28:36 +00:00
Tom Lane	8d675c85c5	pgstat's on-proc-exit hook has to execute after the last transaction commit or abort within a backend; rearrange InitPostgres processing to make it so. Revealed by just-added Asserts along with ECPG regression tests (hm, I wonder why the core regression tests didn't expose it?). This possibly is another reason for missing stats updates ...	2007-05-27 05:37:50 +00:00
Tom Lane	77947c51c0	Fix up pgstats counting of live and dead tuples to recognize that committed and aborted transactions have different effects; also teach it not to assume that prepared transactions are always committed. Along the way, simplify the pgstats API by tying counting directly to Relations; I cannot detect any redeeming social value in having stats pointers in HeapScanDesc and IndexScanDesc structures. And fix a few corner cases in which counts might be missed because the relation's pgstat_info pointer hadn't been set.	2007-05-27 03:50:39 +00:00
Tom Lane	cadb78330e	Repair two constraint-exclusion corner cases triggered by proving that an inheritance child of an UPDATE/DELETE target relation can be excluded by constraints. I had rearranged some code in set_append_rel_pathlist() to avoid "useless" work when a child is excluded, but overdid it and left the child with no cheapest_path entry, causing possible failure later if the appendrel was involved in a join. Also, it seems that the dummy plan generated by inheritance_planner() when all branches are excluded has to be a bit less dummy now than was required in 8.2. Per report from Jan Wieck. Add his test case to the regression tests.	2007-05-26 18:23:02 +00:00
Tom Lane	604ffd280b	Create hooks to let a loadable plugin monitor (or even replace) the planner and/or create plans for hypothetical situations; in particular, investigate plans that would be generated using hypothetical indexes. This is a heavily-rewritten version of the hooks proposed by Gurjeet Singh for his Index Advisor project. In this formulation, the index advisor can be entirely a loadable module instead of requiring a significant part to be in the core backend, and plans can be generated for hypothetical indexes without requiring the creation and rolling-back of system catalog entries. The index advisor patch as-submitted is not compatible with these hooks, but it needs significant work anyway due to other 8.2-to-8.3 planner changes. With these hooks in the core backend, development of the advisor can proceed as a pgfoundry project.	2007-05-25 17:54:25 +00:00
Tom Lane	ce5b24abed	Remove ruleutils.c's use of varnoold/varoattno as a shortcut for determining what a Var node refers to. This is no longer necessary because the new flat-range-table representation of plan trees makes it relatively easy to dig down through child plan levels to find the original reference; and to keep doing it that way, we'd have to store joinaliasvars lists in flattened RTEs, as demonstrated by bug report from Leszek Trenkner. This change makes varnoold/varoattno truly just debug aids, which wasn't quite the case before. Perhaps we should drop them, or only have them in assert-enabled builds?	2007-05-24 18:58:42 +00:00
Tom Lane	11086f2f2b	Repair planner bug introduced in 8.2 by ability to rearrange outer joins: in cases where a sub-SELECT inserts a WHERE clause between two outer joins, that clause may prevent us from re-ordering the two outer joins. The code was considering only the joins' own ON-conditions in determining reordering safety, which is not good enough. Add a "delay_upper_joins" flag to OuterJoinInfo to flag that we have detected such a clause and higher-level outer joins shouldn't be permitted to commute with this one. (This might seem overly coarse, but given the current rules for OJ reordering, it's sufficient AFAICT.) The failure case is actually pretty narrow: it needs a WHERE clause within the RHS of a left join that checks the RHS of a lower left join, but is not strict for that RHS (else we'd have simplified the lower join to a plain join). Even then no failure will be manifest unless the planner chooses to rearrange the join order. Per bug report from Adam Terrey.	2007-05-22 23:23:58 +00:00
Tom Lane	d7153c5fad	Fix best_inner_indexscan to return both the cheapest-total-cost and cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider both of these (when different) as the inside of a nestloop join. The original design was based on the assumption that indexscan paths always have negligible startup cost, and so total cost is the only important figure of merit; an assumption that's obviously broken by bitmap indexscans. This oversight could lead to choosing poor plans in cases where fast-start behavior is more important than total cost, such as LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from Chuck D.	2007-05-22 01:40:33 +00:00
Tom Lane	2415ad9831	Teach tuplestore.c to throw away data before the "mark" point when the caller is using mark/restore but not rewind or backward-scan capability. Insert a materialize plan node between a mergejoin and its inner child if the inner child is a sort that is expected to spill to disk. The materialize shields the sort from the need to do mark/restore and thereby allows it to perform its final merge pass on-the-fly; while the materialize itself is normally cheap since it won't spill to disk unless the number of tuples with equal key values exceeds work_mem. Greg Stark, with some kibitzing from Tom Lane.	2007-05-21 17:57:35 +00:00
Peter Eisentraut	3963574d13	XPath fixes: - Function renamed to "xpath". - Function is now strict, per discussion. - Return empty array in case when XPath expression detects nothing (previously, NULL was returned in such case), per discussion. - (bugfix) Work with fragments with prologue: select xpath('/a', '<?xml version="1.0"?><a /><b />'); // now XML datum is always wrapped with dummy <x>...</x>, XML prologue simply goes away (if any). - Some cleanup. Nikolay Samokhvalov Some code cleanup and documentation work by myself.	2007-05-21 17:10:29 +00:00
Tom Lane	a8d539f124	To support external compression of archived WAL data, add a flag bit to WAL records that shows whether it is safe to remove full-page images (ie, whether or not an on-line backup was in progress when the WAL entry was made). Also make provision for an XLOG_NOOP record type that can be used to fill in the extra space when decompressing the data for restore. This is the portion of Koichi Suzuki's "full page writes" patch that has to go into the core database. The remainder of that work is two external compression and decompression programs, which for the time being will undergo separate development on pgfoundry. Per discussion. Also, twiddle the handling of BTREE_SPLIT records to ensure it'll be possible to compress them (the previous coding caused essential info to be omitted). The other commonly-used record types seem OK already, with the possible exception of GIN and GIST WAL records, which I don't understand well enough to opine on.	2007-05-20 21:08:19 +00:00
Alvaro Herrera	e18ca9bbaa	Fix dumb compile error in the last patch.	2007-05-19 01:02:34 +00:00
Alvaro Herrera	b40776d221	Have CLUSTER advance the table's relfrozenxid. The new frozen point is the FreezeXid introduced in a recent commit, so there isn't any data loss in this approach. Doing it causes ALTER TABLE (or rather, the forms of it that cause a full table rewrite) to be affected as well. In this case, the frozen point is RecentXmin, because after the rewrite all the tuples are relabeled with the rewriting transaction's Xid. TOAST tables are fixed automatically as well, as fallout of the way they were already being handled in the respective code paths. With this patch, there is no longer need to VACUUM tables for Xid wraparound purposes that have been cleaned up via TRUNCATE or CLUSTER.	2007-05-18 23:19:42 +00:00
Tom Lane	d1972c52a8	Remove redundant logging of send failures when SSL is in use. While pqcomm.c had been taught not to do that ages ago, the SSL code was helpfully bleating anyway. Resolves some recent reports such as bug #3266; however the underlying cause of the related bug #2829 is still unclear.	2007-05-18 01:20:16 +00:00
Tom Lane	dbb769352d	Temporary fix for the problem that pg_stat_activity, inet_client_addr(), and inet_server_addr() fail if the client connected over a "scoped" IPv6 address. In this case getnameinfo() will return a string ending with a poorly-standardized "%something" zone specifier, which these functions try to feed to network_in(), which won't take it. So that we don't lose functionality altogether, suppress the zone specifier before giving the string to network_in(). Per report from Brian Hirt. TODO: probably someday the inet type should support scoped IPv6 addresses, and then this patch should be reverted. Backpatch to 8.2 ... is it worth going further?	2007-05-17 23:31:49 +00:00
Tom Lane	b11123b675	Fix parameter recalculation for Limit nodes: during a ReScan call we must recompute the limit/offset immediately, so that the updated values are available when the child's ReScan function is invoked. Add a regression test for this, too. Bug is new in HEAD (due to the bounded-sorting patch) so no need for back-patch. I did not do anything about merging this signaling with chgParam processing, but if we were to do that we'd still need to compute the updated values at this point rather than during the first ProcNode call. Per observation and test case from Greg Stark, though I didn't use his patch.	2007-05-17 19:35:08 +00:00
Alvaro Herrera	3b0347b36e	Move the tuple freezing point in CLUSTER to a point further back in the past, to avoid losing useful Xid information in not-so-old tuples. This makes CLUSTER behave the same as VACUUM as far as tuple-freezing behavior goes (though CLUSTER does not yet advance the table's relfrozenxid). While at it, move the actual freezing operation in rewriteheap.c to a more appropriate place, and document it thoroughly. This part of the patch from Tom Lane.	2007-05-17 15:28:29 +00:00
Alvaro Herrera	90cbc63fd1	Have TRUNCATE advance the affected table's relfrozenxid to RecentXmin, to avoid a later needless VACUUM for Xid-wraparound purposes. We can do this since the table is known to be left empty, so no Xid remains on it. Per discussion.	2007-05-16 17:28:20 +00:00
Alvaro Herrera	dfed0012bc	Have the rewriteheap code freeze old tuples. This is safe because it is only applied to live tuples older than a recent Xmin, not to tuples that may be part of an update chain. Those still keep their original markings. This patch makes it possible for CLUSTER to advance relfrozenxid, thus avoiding the need of vacuuming the table for Xid wraparound purposes. That will be patched separately. Patch from Heikki Linnakangas.	2007-05-16 16:36:56 +00:00
Tom Lane	0a9cbcbfd2	Get rid of the pg_shdepend entry for a TOAST table; it's unnecessary since there's an indirect dependency on the owner via the parent table. We were already handling indexes that way, but not toast tables for some reason. Saves a little catalog space and cuts down the verbosity of checkSharedDependencies reports.	2007-05-14 20:24:41 +00:00
Tom Lane	2b321533f3	Fix up grammar and translatability of recent checkSharedDependencies patch; also make the code logic a bit more self-consistent.	2007-05-14 20:07:01 +00:00
Tom Lane	fd53a67dcd	Prevent RevalidateCachedPlan from making any permanent change in ActiveSnapshot. Having it affect ActiveSnapshot only in the unusual case of needing to replan seems a bad idea, and there's also the problem that the created snap might be in a relatively short-lived context, as noted by Jan Wieck. Also, there's no need to force a new snap at all unless we are called with no snap currently set, which is an unusual case in itself.	2007-05-14 18:13:21 +00:00
Alvaro Herrera	689dea424d	Report all dependent objects to the server log when a shared object is dropped, and only a truncated log of the objects in the current database to the client. Also, instead of reporting object counts for all databases on which the user might own objects, report only as many as fit in the predefined line count. This is to avoid flooding the client when the user owns too many objects, which could cause problems. Per report from Ed L. on April 4th and subsequent discussion.	2007-05-14 16:50:36 +00:00
Tom Lane	1856e609ec	Improve predicate_refuted_by_simple_clause() to handle IS NULL and IS NOT NULL more completely. The motivation for having it understand IS NULL at all was to allow use of "foo IS NULL" as one of the subsets of a partitioning on "foo", but as reported by Aleksander Kmetec, it wasn't really getting the job done. Backpatch to 8.2 since this is arguably a performance bug.	2007-05-12 19:22:35 +00:00
Tom Lane	9aa3c782c9	Fix the problem that creating a user-defined type named _foo, followed by one named foo, would work but the other ordering would not. If a user-specified type or table name collides with an existing auto-generated array name, just rename the array type out of the way by prepending more underscores. This should not create any backward-compatibility issues, since the cases in which this will happen would have failed outright in prior releases. Also fix an oversight in the arrays-of-composites patch: ALTER TABLE RENAME renamed the table's rowtype but not its array type.	2007-05-12 00:55:00 +00:00
Tom Lane	d8326119c8	Fix my oversight in enabling domains-of-domains: ALTER DOMAIN ADD CONSTRAINT needs to check the new constraint against columns of derived domains too. Also, make it error out if the domain to be modified is used within any composite-type columns. Eventually we should support that case, but it seems a bit painful, and not suitable for a back-patch. For the moment just let the user know we can't do it. Backpatch to 8.2, which is the only released version that allows nested domains. Possibly the other part should be back-patched further.	2007-05-11 20:17:15 +00:00
Tom Lane	bc8036fc66	Support arrays of composite types, including the rowtypes of regular tables and views (but not system catalogs, nor sequences or toast tables). Get rid of the hardwired convention that a type's array type is named exactly "_type", instead using a new column pg_type.typarray to provide the linkage. (It still will be named "_type", though, except in odd corner cases such as maximum-length type names.) Along the way, make tracking of owner and schema dependencies for types more uniform: a type directly created by the user has these dependencies, while a table rowtype or auto-generated array type does not have them, but depends on its parent object instead. David Fetter, Andrew Dunstan, Tom Lane	2007-05-11 17:57:14 +00:00
Neil Conway	ade493e02d	Add a hash function for "numeric". Mark the equality operator for numerics as "oprcanhash", and make the corresponding system catalog updates. As a result, hash indexes, hashed aggregation, and hash joins can now be used with the numeric type. Bump the catversion. The only tricky aspect to doing this is writing a correct hash function: it's possible for two Numerics to be equal according to their equality operator, but have different in-memory bit patterns. To cope with this, the hash function doesn't consider the Numeric's "scale" or "sign", and explicitly skips any leading or trailing zeros in the Numeric's digit buffer (the current implementation should suppress any such zeros, but it seems unwise to rely upon this). See discussion on pgsql-patches for more details.	2007-05-08 18:56:48 +00:00
Peter Eisentraut	3b4f9fe5d2	The appended patch addresses the outstanding issues of the recent guc patch. It makes PGCLIENTENCODING work again and uses bsearch() instead of iterating over the array of guc variables in guc_get_index(). Joachim Wieland	2007-05-08 16:33:51 +00:00
Alvaro Herrera	067deaf83d	Make sure we don't skip databases that are supposed to be vacuumed "exactly now". This can happen if the time granularity is not very high. Per ITAGAKI Takahiro.	2007-05-07 20:41:24 +00:00
Magnus Hagander	343a9a27a9	Check return code from strxfrm on Windows since it has a non-standard way of indicating errors, so we don't try to allocate INT_MAX bytes to store a result in.	2007-05-05 17:05:48 +00:00
Tom Lane	d2a4a4069f	Add a line to the EXPLAIN ANALYZE output for a Sort node, showing the actual sort strategy and amount of space used. By popular demand.	2007-05-04 21:29:53 +00:00
Tom Lane	fab789eac9	Suppress a recently-introduced 'variable might be clobbered by longjmp' warning.	2007-05-04 02:06:13 +00:00
Tom Lane	79ca7ffeb6	A few fixups in error handling: mark pg_re_throw() as noreturn for gcc, and for other compilers, insert a dummy exit() call so that they understand PG_RE_THROW() doesn't return. Insert fflush(stderr) in ExceptionalCondition, per recent buildfarm evidence that that might not happen automatically on some platforms. And const-ify ExceptionalCondition's declaration while at it.	2007-05-04 02:01:02 +00:00
Tom Lane	d26559dbf3	Teach tuplesort.c about "top N" sorting, in which only the first N tuples need be returned. We keep a heap of the current best N tuples and sift-up new tuples into it as we scan the input. For M input tuples this means only about Mlog(N) comparisons instead of Mlog(M), not to mention a lot less workspace when N is small --- avoiding spill-to-disk for large M is actually the most attractive thing about it. Patch includes planner and executor support for invoking this facility in ORDER BY ... LIMIT queries. Greg Stark, with some editorialization by moi.	2007-05-04 01:13:45 +00:00
Tom Lane	0fef38da21	Tweak hash index AM to use the new ReadOrZeroBuffer bufmgr API when fetching pages it intends to zero immediately. Just to show there is some use for that function besides WAL recovery :-). Along the way, fold _hash_checkpage and _hash_pageinit calls into _hash_getbuf and friends, instead of expecting callers to do that separately.	2007-05-03 16:45:58 +00:00
Tom Lane	63735ca815	Dept. of second thoughts: add comments cautioning against using ReadOrZeroBuffer to fetch pages from beyond physical EOF. This would usually work, but would cause problems for md.c if writes occurred beyond a segment boundary when the previous segment file hadn't been fully extended.	2007-05-02 23:34:48 +00:00
Tom Lane	8c3cc86e7b	During WAL recovery, when reading a page that we intend to overwrite completely from the WAL data, don't bother to physically read it; just have bufmgr.c return a zeroed-out buffer instead. This speeds recovery significantly, and also avoids unnecessary failures when a page-to-be-overwritten has corrupt page headers on disk. This replaces a former kluge that accomplished the latter by pretending zero_damaged_pages was always ON during WAL recovery; which was OK when the kluge was put in, but is unsafe when restoring a WAL log that was written with full_page_writes off. Heikki Linnakangas	2007-05-02 23:18:03 +00:00
Tom Lane	8ec943856a	Fix things so that when CREATE INDEX CONCURRENTLY sets pg_index.indisvalid true at the very end of its processing, the update is broadcast via a shared-cache-inval message for the index; without this, existing backends that already have relcache entries for the index might never see it become valid. Also, force a relcache inval on the index's parent table at the same time, so that any cached plans for that table are re-planned; this ensures that the newly valid index will be used if appropriate. Aside from making C.I.C. behave more reasonably, this is necessary infrastructure for some aspects of the HOT patch. Pavan Deolasee, with a little further stuff from me.	2007-05-02 21:08:46 +00:00
Alvaro Herrera	229d33801d	Use the new TimestampDifferenceExceeds API instead of timestamp_cmp_internal and TimestampDifference, to make coding clearer. I think this should also fix the failure to start workers in platforms with low resolution timers, as reported by Itagaki Takahiro.	2007-05-02 18:27:57 +00:00
Alvaro Herrera	a115bfe3b9	Fix failure to check for INVALID worker entry in the new autovacuum code, which could happen when a worker took too long to start and was thus "aborted" by the launcher. Noticed by lionfish buildfarm member.	2007-05-02 15:47:14 +00:00
Tom Lane	88f1fd2989	Fix oversight in PG_RE_THROW processing: it's entirely possible that there isn't any place to throw the error to. If so, we should treat the error as FATAL, just as we would have if it'd been thrown outside the PG_TRY block to begin with. Although this is clearly a potential source of bugs, it is not clear at the moment whether it is an actual source of bugs; there may not presently be any PG_TRY blocks in code that can be reached with no outer longjmp catcher. So for the moment I'm going to be conservative and not back-patch this. The change breaks ABI for users of PG_RE_THROW and hence might create compatibility problems for loadable modules, so we should not put it into released branches without proof that it's needed.	2007-05-02 15:32:42 +00:00
Tom Lane	b4349519c1	Fix a thinko in my patch of a couple months ago for bug #3116: it did the wrong thing when inlining polymorphic SQL functions, because it was using the function's declared return type where it should have used the actual result type of the current call. In 8.1 and 8.2 this causes obvious failures even if you don't have assertions turned on; in 8.0 and 7.4 it would only be a problem if the inlined expression were used as an input to a function that did run-time type determination on its inputs. Add a regression test, since this is evidently an under-tested area.	2007-05-01 18:53:52 +00:00
Tom Lane	c432061963	Change the timestamps recorded in transaction commit/abort xlog records from time_t to TimestampTz representation. This provides full gettimeofday() resolution of the timestamps, which might be useful when attempting to do point-in-time recovery --- previously it was not possible to specify the stop point with sub-second resolution. But mostly this is to get rid of TimestampTz-to-time_t conversion overhead during commit. Per my proposal of a day or two back.	2007-04-30 21:01:53 +00:00
Tom Lane	641912b4d1	Fix oversight in my patch of yesterday: forgot to ensure that stats would still be forced out at backend exit.	2007-04-30 16:37:08 +00:00
Tom Lane	957d08c81f	Implement rate-limiting logic on how often backends will attempt to send messages to the stats collector. This avoids the problem that enabling stats_row_level for autovacuum has a significant overhead for short read-only transactions, as noted by Arjen van der Meijden. We can avoid an extra gettimeofday call by piggybacking on the one done for WAL-logging xact commit or abort (although that doesn't help read-only transactions, since they don't WAL-log anything). In my proposal for this, I noted that we could change the WAL log entries for commit/abort to record full TimestampTz precision, instead of only time_t as at present. That's not done in this patch, but will be committed separately.	2007-04-30 03:23:49 +00:00
Tom Lane	57b82bf324	Marginal performance hack: use a dedicated routine instead of copyObject to copy nodes that are known to be Vars during plan reference adjustment. Saves useless memzero operation as well as the big switch in copyObject.	2007-04-30 00:16:43 +00:00
Tom Lane	afaa6b9821	Marginal performance hack: avoid unnecessary work in expression_tree_mutator. We can just palloc, instead of using makeNode, when we are going to overwrite the whole node anyway in the FLATCOPY macro. Also, use FLATCOPY instead of copyObject for common node types Var and Const.	2007-04-30 00:14:54 +00:00
Tom Lane	39a333aa2b	Marginal performance hack: remove the loop that used to be needed to look through a freelist for a chunk of adequate size. For a long time now, all elements of a given freelist have been exactly the same allocated size, so we don't need a loop. Since the loop never iterated more than once, you'd think this wouldn't matter much, but it makes a noticeable savings in a simple test --- perhaps because the compiler isn't optimizing on a mistaken assumption that the loop would repeat. AllocSetAlloc is called often enough that saving even a couple of instructions is worthwhile.	2007-04-30 00:12:08 +00:00
Tom Lane	bbbe825f5f	Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve the types of unspecified parameters when submitted via extended query protocol. This worked in 8.2 but I had broken it during plancache changes. DECLARE CURSOR is now treated almost exactly like a plain SELECT through parse analysis, rewrite, and planning; only just before sending to the executor do we divert it away to ProcessUtility. This requires a special-case check in a number of places, but practically all of them were already special-casing SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge the two by treating IntoClause as a form of utility statement? Not going to worry about that now, though.) That approach doesn't work for EXPLAIN, however, so for that I punted and used a klugy solution of running parse analysis an extra time if under extended query protocol.	2007-04-27 22:05:49 +00:00
Tom Lane	a2e923a652	Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan is in progress on the same hashtable. This seems the least invasive way to fix the recently-recognized problem that a split could cause the scan to visit entries twice or (with much lower probability) miss them entirely. The only field-reported problem caused by this is the "failed to re-find shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky, which was caused by multiply visited entries. However, it seems certain that mdsync() is vulnerable to missing required fsync's due to missed entries, and I am fearful that RelationCacheInitializePhase2() might be at risk as well. Because of that and the generalized hazard presented by this bug, back-patch all the supported branches. Along the way, fix pg_prepared_statement() and pg_cursor() to not assume that the hashtables they are examining will stay static between calls. This is risky regardless of the newly noted dynahash problem, because hash_seq_search() has never promised to cope with deletion of table entries other than the just-returned one. There may be no bug here because the only supported way to call these functions is via ExecMakeTableFunctionResult() which will cycle them to completion before doing anything very interesting, but it seems best to get rid of the assumption. This affects 8.2 and HEAD only, since those functions weren't there earlier.	2007-04-26 23:24:46 +00:00
Neil Conway	16efdb5ec7	Rename the newly-added commands for discarding session state. RESET SESSION, RESET PLANS, and RESET TEMP are now DISCARD ALL, DISCARD PLANS, and DISCARD TEMP, respectively. This is to avoid confusion with the pre-existing RESET variants: the DISCARD commands are not actually similar to RESET. Patch from Marko Kreen, with some minor editorialization.	2007-04-26 16:13:15 +00:00
Magnus Hagander	93dc5a234e	Set maximum semaphore count to 32767 instead of 1. Fixes errorcode 298 when unlocking a semaphore more than once. Per report from Marcin Waldowski.	2007-04-24 12:25:18 +00:00
Tom Lane	dbcd9d6160	Remove some of the most blatant brain-fade in the recent guc patch (it's so nice to have a buildfarm member that actively rejects naked uses of strcasecmp). This coding is still pretty awful, though, since it's going to be O(N^2) in the number of guc variables. May I direct your attention to bsearch?	2007-04-22 03:52:40 +00:00
Tom Lane	afcf09dd90	Some further performance tweaks for planning large inheritance trees that are mostly excluded by constraints: do the CE test a bit earlier to save some adjust_appendrel_attrs() work on excluded children, and arrange to use array indexing rather than rt_fetch() to fetch RTEs in the main body of the planner. The latter is something I'd wanted to do for awhile anyway, but seeing list_nth_cell() as 35% of the runtime gets one's attention.	2007-04-21 21:01:45 +00:00
Peter Eisentraut	b7edb568bd	Make configuration parameters fall back to their default values when they are removed from the configuration file. Joachim Wieland	2007-04-21 20:02:41 +00:00
Tom Lane	48239e156f	Avoid useless work during set_plain_rel_pathlist() when the relation will be excluded by constraint exclusion anyway. Greg Stark	2007-04-21 06:18:52 +00:00
Tom Lane	925ca9d7de	Tweak make_inh_translation_lists() to check the common case wherein parent and child attnums are the same, before it grovels through each and every child column looking for a name match. Saves some time in large inheritance trees, per example from Greg.	2007-04-21 05:56:41 +00:00
Tom Lane	402bd494ce	Improve the way in which CatalogCacheComputeHashValue combines multiple key values: don't throw away perfectly good hash bits, and increase the shift distances so as to provide more separation in the common case where some of the key values are small integers (and so their hashes are too, because hashfunc.c doesn't try all that hard). This reduces the runtime of SearchCatCache by a factor of 4 in an example provided by Greg Stark, in which the planner spends a whole lot of time searching the two-key STATRELATT cache. It seems unlikely to hurt in other cases, but maybe we could do even better?	2007-04-21 04:49:20 +00:00
Tom Lane	11da4c671e	Adjust pgstat_initstats() to avoid repeated searches of the TabStat arrays when a relation is opened multiple times in the same transaction. This is particularly useful for system catalogs, which we may heap_open or index_open many times in a transaction, and it doesn't really cost anything extra even if the rel is touched but once. Motivated by study of an example from Greg Stark, in which pgstat_initstats() accounted for an unreasonably large fraction of the runtime.	2007-04-21 04:10:53 +00:00
Tom Lane	ca3d14f2a9	Tweak set_rel_width() to avoid redundant executions of getrelid(). In very large queries this accounts for a noticeable fraction of planning time. Per an example from Greg Stark.	2007-04-21 02:41:13 +00:00
Bruce Momjian	1c8302cab3	Add comment on why deadlock detection error messages print only numbers.	2007-04-20 20:15:52 +00:00
Tom Lane	aa27977fe2	Support explicit placement of the temporary-table schema within search_path. This is needed to allow a security-definer function to set a truly secure value of search_path. Without it, a malicious user can use temporary objects to execute code with the privileges of the security-definer function. Even pushing the temp schema to the back of the search path is not quite good enough, because a function or operator at the back of the path might still capture control from one nearer the front due to having a more exact datatype match. Hence, disable searching the temp schema altogether for functions and operators. Security: CVE-2007-2138	2007-04-20 02:37:38 +00:00
Tom Lane	9d37c038fc	Repair PANIC condition in hash indexes when a previous index extension attempt failed (due to lock conflicts or out-of-space). We might have already extended the index's filesystem EOF before failing, causing the EOF to be beyond what the metapage says is the last used page. Hence the invariant maintained by the code needs to be "EOF is at or beyond last used page", not "EOF is exactly the last used page". Problem was created by my patch of 2006-11-19 that attempted to repair bug #2737. Since that was back-patched to 7.4, this needs to be as well. Per report and test case from Vlastimil Krejcir.	2007-04-19 20:24:04 +00:00
Alvaro Herrera	dfa58878cb	Silence compiler warnings, per Bruce.	2007-04-19 16:26:44 +00:00
Alvaro Herrera	ef23a77441	Enable configurable log of autovacuum actions. Initial patch from Simon Riggs, additional code and docs by me. Per discussion.	2007-04-18 16:44:18 +00:00
Bruce Momjian	c228448910	Update docs/error message for CSV quote/escape --- must be ASCII. Backpatch doc change to 8.2.X.	2007-04-18 02:28:22 +00:00
Bruce Momjian	4029a5af9b	Update error message for COPY with a multi-byte delimiter.	2007-04-18 00:38:57 +00:00
Tom Lane	836feeda9c	Fix condition for whether end_heap_rewrite must fsync, per Heikki.	2007-04-17 21:29:31 +00:00
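
Note on commit d26559dbf3 ("top N" sorting): the message describes keeping a heap of the current best N tuples while scanning M inputs, for roughly M log(N) comparisons instead of M log(M). The sketch below only illustrates that bounded-heap idea in Python; it is not the tuplesort.c code, the function name top_n is hypothetical, and it assumes plain numeric values sorted in ascending order.

```python
import heapq


def top_n(values, n):
    """Return the n smallest numbers from values, in ascending order.

    Illustrative only: keeps at most n items in memory, the way a
    bounded "top N" sort avoids holding (or spilling) all M inputs.
    """
    if n <= 0:
        return []
    # heapq is a min-heap, so store negated values to get a max-heap:
    # heap[0] is then the negation of the worst value currently kept.
    heap = []
    for value in values:
        if len(heap) < n:
            heapq.heappush(heap, -value)
        elif value < -heap[0]:
            # New value beats the worst kept one: replace it in O(log n).
            heapq.heapreplace(heap, -value)
    return sorted(-v for v in heap)


if __name__ == "__main__":
    print(top_n([42, 7, 19, 3, 88, 5], 3))
```

Each input is compared against the heap's root and only sometimes sifted in, so the working set never exceeds N items. That matches the benefit the commit message singles out for large M.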

9173 Commits