postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	4adc2f72a4	Change hash indexes to store only the hash code rather than the whole indexed value. This means that hash index lookups are always lossy and have to be rechecked when the heap is visited; however, the gain in index compactness outweighs this when the indexed values are wide. Also, we only need to perform datatype comparisons when the hash codes match exactly, rather than for every entry in the hash bucket; so it could also win for datatypes that have expensive comparison functions. A small additional win is gained by keeping hash index pages sorted by hash code and using binary search to reduce the number of index tuples we have to look at. Xiao Meng This commit also incorporates Zdenek Kotala's patch to isolate hash metapages and hash bitmaps a bit better from the page header datastructures.	2008-09-15 18:43:41 +00:00
Magnus Hagander	9872381090	Parse pg_hba.conf in postmaster, instead of once in each backend for each connection. This makes it possible to catch errors in the pg_hba file when it's being reloaded, instead of silently reloading a broken file and failing only when a user tries to connect. This patch also makes the "sameuser" argument to ident authentication optional.	2008-09-15 12:32:57 +00:00
Tom Lane	bf0b6ac43c	Skip opfamily check in eclass_matches_any_index() when the index isn't a btree. We can't easily tell whether clauses generated from the equivalence class could be used with such an index, so just assume that they might be. This bit of over-optimization prevented use of non-btree indexes for nestloop inner indexscans, in any case where the join uses an equality operator that is also a btree operator --- which in particular is typically true for hash indexes. Noted while trying to test the current hash index patch.	2008-09-12 14:56:13 +00:00
Tom Lane	06edce4c3f	Tighten up to_date/to_timestamp so that they are more likely to reject erroneous input, rather than silently producing bizarre results as formerly happened. Brendan Jurd	2008-09-11 17:32:34 +00:00
Tom Lane	70530c808b	Adjust the parser to accept the typename syntax INTERVAL ... SECOND(n) and the literal syntax INTERVAL 'string' ... SECOND(n), as required by the SQL standard. Our old syntax put (n) directly after INTERVAL, which was a mistake, but will still be accepted for backward compatibility as well as symmetry with the TIMESTAMP cases. Change intervaltypmodout to show it in the spec's way, too. (This could potentially affect clients, if there are any that analyze the typmod of an INTERVAL in any detail.) Also fix interval input to handle 'min:sec.frac' properly; I had overlooked this case in my previous patch. Document the use of the interval fields qualifier, which up to now we had never mentioned in the docs. (I think the omission was intentional because it didn't work per spec; but it does now, or at least close enough to be credible.)	2008-09-11 15:27:30 +00:00
Alvaro Herrera	d53a56687f	Initialize the minimum frozen Xid in vac_update_datfrozenxid using GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not depend on the latter being correctly set elsewhere, and while it is more expensive, this code path is not performance-critical. This is a real risk for autovacuum, because it can execute whole cycles without doing a single vacuum, which would mean that RecentGlobalXmin would stay at its initialization value, FirstNormalTransactionId, causing a bogus value to be inserted in pg_database. This bug could explain some recent reports of failure to truncate pg_clog. At the same time, change the initialization of RecentGlobalXmin to InvalidTransactionId, and ensure that it's set to something else whenever it's going to be used. Using it as FirstNormalTransactionId in HOT page pruning could incur in data loss. InitPostgres takes care of setting it to a valid value, but the extra checks are there to prevent "special" backends from behaving in unusual ways. Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us	2008-09-11 14:01:10 +00:00
Tom Lane	b8646012d5	Tweak newly added set_config_sourcefile() so that the target record isn't left corrupt if guc_strdup should fail.	2008-09-10 19:16:22 +00:00
Tom Lane	f867339c01	Make our parsing of INTERVAL literals spec-compliant (or at least a heck of a lot closer than it was before). To do this, tweak coerce_type() to pass through the typmod information when invoking interval_in() on an UNKNOWN constant; then fix DecodeInterval to pay attention to the typmod when deciding how to interpret a units-less integer value. I changed one or two other details as well. I believe the code now reacts as expected by spec for all the literal syntaxes that are specifically enumerated in the spec. There are corner cases involving strings that don't exactly match the set of fields called out by the typmod, for which we might want to tweak the behavior some more; but I think this is an area of user friendliness rather than spec compliance. There remain some non-compliant details about the SQL syntax (as opposed to what's inside the literal string); but at least we'll throw error rather than silently doing the wrong thing in those cases.	2008-09-10 18:29:41 +00:00
Alvaro Herrera	3b9ec4682c	Add "source file" and "source line" information to each GUC variable. initdb forced due to changes in the pg_settings view. Magnus Hagander and Alvaro Herrera.	2008-09-10 18:09:20 +00:00
Tom Lane	ee33b95d9c	Improve the plan cache invalidation mechanism to make it invalidate plans when user-defined functions used in a plan are modified. Also invalidate plans when schemas, operators, or operator classes are modified; but for these cases we just invalidate everything rather than tracking exact dependencies, since these types of objects seldom change in a production database. Tom Lane; loosely based on a patch by Martin Pihlak.	2008-09-09 18:58:09 +00:00
Tom Lane	ead21631e8	Fix a couple of problems pointed out by Fujii Masao in the 2008-Apr-05 patch for pg_stop_backup. First, it is possible that the history file name is not alphabetically later than the last WAL file name, so we should explicitly check that both have been archived. Second, the previous coding would wait forever if a checkpoint had managed to remove the WAL file before we look for it. Simon Riggs, plus some code cleanup by me.	2008-09-08 16:42:15 +00:00
Tom Lane	a0b76dc662	Create a separate grantable privilege for TRUNCATE, rather than having it be always owner-only. The TRUNCATE privilege works identically to the DELETE privilege so far as interactions with the rest of the system go. Robert Haas	2008-09-08 00:47:41 +00:00
Tom Lane	a26c7e3d71	Support set-returning functions in the target lists of Agg and Group plan nodes. This is a pretty ugly feature but since we don't yet have a plausible substitute, we'd better support it everywhere. Per gripe from Jeff Davis.	2008-09-08 00:22:56 +00:00
Tom Lane	e6a310b281	Reimplement text_position and related functions to use Boyer-Moore-Horspool searching instead of naive matching. In the worst case this has the same O(M*N) complexity as the naive method, but the worst case is hard to hit, and the average case is very fast, especially with longer patterns. David Rowley	2008-09-07 04:20:00 +00:00
Tom Lane	409c144d83	Adjust psql's new \ef command to present an empty CREATE FUNCTION template for editing if no function name is specified. This seems a much cleaner way to offer that functionality than the original patch had. In passing, de-clutter the error displays that are given for a bogus function-name argument, and standardize on "$function$" as the default delimiter for the function body. (The original coding would use the shortest possible dollar-quote delimiter, which seems to create unnecessarily high risk of later conflicts with the user-modified function body.)	2008-09-06 20:18:08 +00:00
Tom Lane	2c863ca818	Implement a psql command "\ef" to edit the definition of a function. In support of that, create a backend function pg_get_functiondef(). The psql command is functional but maybe a bit rough around the edges... Abhijit Menon-Sen	2008-09-06 00:01:25 +00:00
Tom Lane	e540b97248	Fix an oversight in the 8.2 patch that improved mergejoin performance by inserting a materialize node above an inner-side sort node, when the sort is expected to spill to disk. (The materialize protects the sort from having to support mark/restore, allowing it to do its final merge pass on-the-fly.) We neglected to teach cost_mergejoin about that hack, so it was failing to include the materialize's costs in the estimated cost of the mergejoin. The materialize's costs are generally going to be pretty negligible in comparison to the sort's, so this is only a small error and probably not worth back-patching; but it's still wrong. In the similar case where a materialize is inserted to protect an inner-side node that can't do mark/restore at all, it's still true that the materialize should not spill to disk, and so we should cost it cheaply rather than expensively. Noted while thinking about a question from Tom Raney.	2008-09-05 21:07:29 +00:00
Peter Eisentraut	11f53b1063	Code coverage testing with gcov. Documentation is in the regression test chapter. Author: Michelle Caisse <Michelle.Caisse@Sun.COM>	2008-09-05 12:11:18 +00:00
Teodor Sigaev	5373817cf2	Fix strategy propagation to scanEntry for partial match by moving propagation to initializaion of scanEntry.	2008-09-04 11:47:05 +00:00
Tom Lane	ba9f37f066	If a loadable module has wrong values in its magic block, spell out exactly what they are in the complaint message. Marko Kreen, some editorialization by me.	2008-09-03 22:34:50 +00:00
Tom Lane	fbb2b69c8f	Prevent memory leaks in our various bison parsers when an error occurs during parsing. Formerly the parser's stack was allocated with malloc and so wouldn't be reclaimed; this patch makes it use palloc instead, so that flushing the current context will reclaim the memory. Per Marko Kreen.	2008-09-02 20:37:55 +00:00
Tom Lane	b153c09209	Add a bunch of new error location reports to parse-analysis error messages. There are still some weak spots around JOIN USING and relation alias lists, but most errors reported within backend/parser/ now have locations.	2008-09-01 20:42:46 +00:00
Heikki Linnakangas	9ac4299163	HeapTupleHeaderAdjustCmax made the incorrect assumption that the raw command id is the cmin, when it can in fact be a combo cid. That made rows incorrectly invisible to a transaction where a tuple was deleted by multiple aborted subtransactions. Report and patch Karl Schnaitter. Back-patch to 8.3, where combo cids was introduced.	2008-09-01 18:52:45 +00:00
Tom Lane	449a00fbbd	Fix the raw-parsetree representation of star (as in SELECT * FROM or SELECT foo.) so that it cannot be confused with a quoted identifier "". Instead create a separate node type A_Star to represent this notation. Per pgsql-hackers discussion of 2007-Sep-27.	2008-08-30 01:39:14 +00:00
Tom Lane	6253f9de67	In GCC-based builds, use a better newNode() macro that relies on GCC-specific syntax to avoid a useless store into a global variable. Per experimentation, this works better than my original thought of trying to push the code into an out-of-line subroutine.	2008-08-29 22:49:07 +00:00
Tom Lane	4571185111	Suppress gcc warning about possibly-uninitialized variable. It's not clear to me why I'd not seen this message before --- on F-9 it seems to only happen if Asserts are disabled, which ought to be irrelevant. Maybe that affects a decision whether to inline get_ten(), which would be needed to expose the warning condition to the compiler? Anyway, the fix is clear.	2008-08-29 16:34:14 +00:00
Peter Eisentraut	7c31742a07	Remove all traces that suggest that a non-Bison yacc might be supported, and change build system to use only Bison. Simplify build rules, make file names uniform. Don't build the token table header file where it is not needed.	2008-08-29 13:02:33 +00:00
Tom Lane	a2794623d2	Extend the parser location infrastructure to include a location field in most node types used in expression trees (both before and after parse analysis). This allows us to place an error cursor in many situations where we formerly could not, because the information wasn't available beyond the very first level of parse analysis. There's a fair amount of work still to be done to persuade individual ereport() calls to actually include an error location, but this gets the initdb-forcing part of the work out of the way; and the situation is already markedly better than before for complaints about unimplementable implicit casts, such as CASE and UNION constructs with incompatible alternative data types. Per my proposal of a few days ago.	2008-08-28 23:09:48 +00:00
Tom Lane	6734182c16	Teach eval_const_expressions() to simplify an ArrayCoerceExpr to a constant when its input is constant and the element coercion function is immutable (or nonexistent, ie, binary-coercible case). This is an oversight in the 8.3 implementation of ArrayCoerceExpr, and its result is that certain cases involving IN or NOT IN with constants don't get optimized as they should be. Per experimentation with an example from Ow Mun Heng.	2008-08-26 02:16:31 +00:00
Tom Lane	e5536e77a5	Move exprType(), exprTypmod(), expression_tree_walker(), and related routines into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.	2008-08-25 22:42:34 +00:00
Tom Lane	d320101b5b	Get rid of the last remaining uses of var_is_rel(), to wit some debugging checks in ExecIndexBuildScanKeys() that were inadequate anyway: it's better to verify the correct varno on an expected index key, not just reject OUTER and INNER. This makes the entire current contents of nodeFuncs.c dead code. I'll be replacing it with some other stuff later, as per recent proposal.	2008-08-25 20:20:30 +00:00
Magnus Hagander	f1e237b6b2	Unconditionally write the statsfile when SIGHUP is received, to minimize the window during which backends have no statistics file to read.	2008-08-25 18:55:43 +00:00
Alvaro Herrera	d96d7be2b5	Update URL to Ross William's paper. Devrim Gunduz.	2008-08-25 17:37:40 +00:00
Magnus Hagander	be8d6c5c34	Make stats_temp_directory PGC_SIGHUP, and document how it may cause a temporary "outage" of the statistics views. This requires making the stats collector respond to SIGHUP, like the other utility processes already did.	2008-08-25 15:11:01 +00:00
Magnus Hagander	8c032adec4	Convert remaining builtin set-returning functions to use OUT parameters, making it possible to call them without specifying a column list. Jaime Casanova	2008-08-25 11:18:43 +00:00
Bruce Momjian	31ad4e5396	Add missing descriptions for aggregates, functions and conversions. Bernd Helmle	2008-08-23 20:31:37 +00:00
Teodor Sigaev	1dcf6fdf1b	Fix possible duplicate tuples while GiST scan. Now page is processed at once and ItemPointers are collected in memory. Remove tuple's killing by killtuple() if tuple was moved to another page - it could produce unaceptable overhead. Backpatch up to 8.1 because the bug was introduced by GiST's concurrency support.	2008-08-23 10:37:24 +00:00
Bruce Momjian	8ddb739e9d	Make "log_temp_files" super-user set only, like other logging options. Simon Riggs	2008-08-22 18:47:07 +00:00
Bruce Momjian	6152de97d3	Minor patch on pgbench 1. -i option should run vacuum analyze only on pgbench tables, not all tables in database. 2. pre-run cleanup step was DELETE FROM HISTORY then VACUUM HISTORY. This is just a slow version of TRUNCATE HISTORY. Simon Riggs	2008-08-22 17:57:34 +00:00
Bruce Momjian	03302fd9b4	Improve wording of error message when a postgresql.conf setting is ignored because it can only be set at server start.	2008-08-22 00:20:40 +00:00
Tom Lane	bd3daddaf2	Arrange to convert EXISTS subqueries that are equivalent to hashable IN subqueries into the same thing you'd have gotten from IN (except always with unknownEqFalse = true, so as to get the proper semantics for an EXISTS). I believe this fixes the last case within CVS HEAD in which an EXISTS could give worse performance than an equivalent IN subquery. The tricky part of this is that if the upper query probes the EXISTS for only a few rows, the hashing implementation can actually be worse than the default, and therefore we need to make a cost-based decision about which way to use. But at the time when the planner generates plans for subqueries, it doesn't really know how many times the subquery will be executed. The least invasive solution seems to be to generate both plans and postpone the choice until execution. Therefore, in a query that has been optimized this way, EXPLAIN will show two subplans for the EXISTS, of which only one will actually get executed. There is a lot more that could be done based on this infrastructure: in particular it's interesting to consider switching to the hash plan if we start out using the non-hashed plan but find a lot more upper rows going by than we expected. I have therefore left some minor inefficiencies in place, such as initializing both subplans even though we will currently only use one.	2008-08-22 00:16:04 +00:00
Tom Lane	cc0dd43850	Marginal improvement in sublink planning: allow unknownEqFalse optimization to be used for SubLinks that are underneath a top-level OR clause. Just as at the very top level of WHERE, it's not necessary to be accurate about whether the sublink returns FALSE or NULL, because either result has the same impact on whether the WHERE will succeed.	2008-08-20 19:58:24 +00:00
Tom Lane	390e59cd5f	Fix obsolete comment. It's no longer the case that Param nodes don't carry typmod.	2008-08-20 15:49:30 +00:00
Tom Lane	9650830bc8	Cause the output from debug_print_parse, debug_print_rewritten, and debug_print_plan to appear at LOG message level, not DEBUG1 as historically. Make debug_pretty_print default to on. Also, cause plans generated via EXPLAIN to be subject to debug_print_plan. This is all to make debug_print_plan a reasonably comfortable substitute for the former behavior of EXPLAIN VERBOSE.	2008-08-19 18:30:04 +00:00
Tom Lane	719012e013	Add some defenses against constant-FALSE outer join conditions. Since eval_const_expressions will generally throw away anything that's ANDed with constant FALSE, what we're left with given an example like select * from tenk1 a where (unique1,0) in (select unique2,1 from tenk1 b); is a cartesian product computation, which is really not acceptable. This is a regression in CVS HEAD compared to previous releases, which were able to notice the impossible join condition in this case --- though not in some related cases that are also improved by this patch, such as select * from tenk1 a left join tenk1 b on (a.unique1=b.unique2 and 0=1); Fix by skipping evaluation of the appropriate side of the outer join in cases where it's demonstrably unnecessary.	2008-08-17 19:40:11 +00:00
Tom Lane	f2689e421d	Remove prohibition against SubLinks in the WHERE clause of an EXISTS subquery that we're considering pulling up. I hadn't wanted to think through whether that could work during the first pass at this stuff. However, on closer inspection it seems to be safe enough.	2008-08-17 02:19:19 +00:00
Tom Lane	19e34b6239	Improve sublink pullup code to handle ANY/EXISTS sublinks that are at top level of a JOIN/ON clause, not only at top level of WHERE. (However, we can't do this in an outer join's ON clause, unless the ANY/EXISTS refers only to the nullable side of the outer join, so that it can effectively be pushed down into the nullable side.) Per request from Kevin Grittner. In passing, fix a bug in the initial implementation of EXISTS pullup: it would Assert if the EXIST's WHERE clause used a join alias variable. Since we haven't yet flattened join aliases when this transformation happens, it's necessary to include join relids in the computed set of RHS relids.	2008-08-17 01:20:00 +00:00
Tom Lane	d4af2a6481	Clean up the loose ends in selectivity estimation left by my patch for semi and anti joins. To do this, pass the SpecialJoinInfo struct for the current join as an additional optional argument to operator join selectivity estimation functions. This allows the estimator to tell not only what kind of join is being formed, but which variable is on which side of the join; a requirement long recognized but not dealt with till now. This also leaves the door open for future improvements in the estimators, such as accounting for the null-insertion effects of lower outer joins. I didn't do anything about that in the current patch but the information is in principle deducible from what's passed. The patch also clarifies the definition of join selectivity for semi/anti joins: it's the fraction of the left input that has (at least one) match in the right input. This allows getting rid of some very fuzzy thinking that I had committed in the original 7.4-era IN-optimization patch. There's probably room to estimate this better than the present patch does, but at least we know what to estimate. Since I had to touch CREATE OPERATOR anyway to allow a variant signature for join estimator functions, I took the opportunity to add a couple of additional checks that were missing, per my recent message to -hackers: * Check that estimator functions return float8; * Require execute permission at the time of CREATE OPERATOR on the operator's function as well as the estimator functions; * Require ownership of any pre-existing operator that's modified by the command. I also moved the lookup of the functions out of OperatorCreate() and into operatorcmds.c, since that seemed more consistent with most of the other catalog object creation processes, eg CREATE TYPE.	2008-08-16 00:01:38 +00:00
Tom Lane	118461114e	Performance fix for new anti-join code in nodeMergejoin.c: after finding a match in antijoin mode, we should advance to next outer tuple not next inner. We know we don't want to return this outer tuple, and there is no point in advancing over matching inner tuples now, because we'd just have to do it again if the next outer tuple has the same merge key. This makes a noticeable difference if there are lots of duplicate keys in both inputs. Similarly, after finding a match in semijoin mode, arrange to advance to the next outer tuple after returning the current match; or immediately, if it fails the extra quals. The rationale is the same. (This is a performance bug in existing releases; perhaps worth back-patching? The planner tries to avoid using mergejoin with lots of duplicates, so it may not be a big issue in practice.) Nestloop and hash got this right to start with, but I made some cosmetic adjustments there to make the corresponding bits of logic look more similar.	2008-08-15 19:20:42 +00:00
Magnus Hagander	5b8eb2b4b9	Make the temporary directory for pgstat files configurable by the GUC variable stats_temp_directory, instead of requiring the admin to mount/symlink the pg_stat_tmp directory manually. For now the config variable is PGC_POSTMASTER. Room for further improvment that would allow it to be changed on-the-fly.	2008-08-15 08:37:41 +00:00

1 2 3 4 5 ...

9876 Commits