postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-05 09:36:54 +02:00

Author	SHA1	Message	Date
Tom Lane	0ee5a39862	Apply a band-aid fix for the problem that 8.2 and up completely misestimate the number of rows likely to be produced by a query such as SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL; What this is doing is selecting for t1 rows with no match in t2, and thus it may produce a significant number of rows even if the t2.key table column contains no nulls at all. 8.2 thinks the table column's null fraction is relevant and thus may estimate no rows out, which results in terrible plans if there are more joins above this one. A proper fix for this will involve passing much more information about the context of a clause to the selectivity estimator functions than we ever have. There's no time left to write such a patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway. Instead, put in an ad-hoc test to defeat the normal table-stats-based estimation when an IS NULL test is evaluated at an outer join, and just use a constant estimate instead --- I went with 0.5 for lack of a better idea. This won't catch every case but it will catch the typical ways of writing such queries, and it seems unlikely to make things worse for other queries.	2007-08-31 23:35:22 +00:00
Tom Lane	140d4ebcb4	Tsearch2 functionality migrates to core. The bulk of this work is by Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing, so anything that's broken is probably my fault. Documentation is nonexistent as yet, but let's land the patch so we can get some portability testing done.	2007-08-21 01:11:32 +00:00
Magnus Hagander	343a9a27a9	Check return code from strxfrm on Windows since it has a non-standard way of indicating errors, so we don't try to allocate INT_MAX bytes to store a result in.	2007-05-05 17:05:48 +00:00
Tom Lane	afcf09dd90	Some further performance tweaks for planning large inheritance trees that are mostly excluded by constraints: do the CE test a bit earlier to save some adjust_appendrel_attrs() work on excluded children, and arrange to use array indexing rather than rt_fetch() to fetch RTEs in the main body of the planner. The latter is something I'd wanted to do for awhile anyway, but seeing list_nth_cell() as 35% of the runtime gets one's attention.	2007-04-21 21:01:45 +00:00
Tom Lane	f02a82b6ad	Make 'col IS NULL' clauses be indexable conditions. Teodor Sigaev, with some kibitzing from Tom Lane.	2007-04-06 22:33:43 +00:00
Tom Lane	bf94076348	Fix array coercion expressions to ensure that the correct volatility is seen by code inspecting the expression. The best way to do this seems to be to drop the original representation as a function invocation, and instead make a special expression node type that represents applying the element-type coercion function to each array element. In this way the element function is exposed and will be checked for volatility. Per report from Guillaume Smet.	2007-03-27 23:21:12 +00:00
Tom Lane	54d20024c1	Fix some problems with selectivity estimation for partial indexes. First, genericcostestimate() was being way too liberal about including partial-index conditions in its selectivity estimate, resulting in substantial underestimates for situations such as an indexqual "x = 42" used with an index on x "WHERE x >= 40 AND x < 50". While the code is intentionally set up to favor selecting partial indexes when available, this was too much... Second, choose_bitmap_and() was likewise easily fooled by cases of this type, since it would similarly think that the partial index had selectivity independent of the indexqual. Fixed by using predicate_implied_by() rather than simple equality checks to determine redundancy. This is a good deal more expensive but I don't see much alternative. At least the extra cost is only paid when there's actually a partial index under consideration. Per report from Jeff Davis. I'm not going to risk back-patching this, though.	2007-03-21 22:18:12 +00:00
Tom Lane	0f4ff460c4	Fix up the remaining places where the expression node structure would lose available information about the typmod of an expression; namely, Const, ArrayRef, ArrayExpr, and EXPR and ARRAY SubLinks. In the ArrayExpr and SubLink cases it wasn't really the data structure's fault, but exprTypmod() being lazy. This seems like a good idea in view of the expected increase in typmod usage from Teodor's work to allow user-defined types to have typmods. In particular this responds to the concerns we had about eliminating the special-purpose hack that exprTypmod() used to have for BPCHAR Consts. We can now tell whether or not such a Const has been cast to a specific length, and report or display properly if so. initdb forced due to changes in stored rules.	2007-03-17 00:11:05 +00:00
Tom Lane	234a02b2a8	Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len). Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with VARSIZE and VARDATA, and as a consequence almost no code was using the longer names. Rename the length fields of struct varlena and various derived structures to catch anyplace that was accessing them directly; and clean up various places so caught. In itself this patch doesn't change any behavior at all, but it is necessary infrastructure if we hope to play any games with the representation of varlena headers. Greg Stark and Tom Lane	2007-02-27 23:48:10 +00:00
Tom Lane	eab6b8b27e	Turn the rangetable used by the executor into a flat list, and avoid storing useless substructure for its RangeTblEntry nodes. (I chose to keep using the same struct node type and just zero out the link fields for unneeded info, rather than making a separate ExecRangeTblEntry type --- it seemed too fragile to have two different rangetable representations.) Along the way, put subplans into a list in the toplevel PlannedStmt node, and have SubPlan nodes refer to them by list index instead of direct pointers. Vadim wanted to do that years ago, but I never understood what he was on about until now. It makes things a whole lot more robust, because we can stop worrying about duplicate processing of subplans during expression tree traversals. That's been a constant source of bugs, and it's finally gone. There are some consequent simplifications yet to be made, like not using a separate EState for subplans in the executor, but I'll tackle that later.	2007-02-22 22:00:26 +00:00
Tom Lane	7c5e5439d2	Get rid of some old and crufty global variables in the planner. When this code was last gone over, there wasn't really any alternative to globals because we didn't have the PlannerInfo struct being passed all through the planner code. Now that we do, we can restructure things to avoid non-reentrancy. I'm fooling with this because otherwise I'd have had to add another global variable for the planned compact range table list.	2007-02-19 07:03:34 +00:00
Teodor Sigaev	61f621b506	Revert gincostestimate changes.	2007-01-31 16:54:51 +00:00
Teodor Sigaev	d4c6da1527	Allow GIN's extractQuery method to signal that nothing can satisfy the query. In this case extractQuery should returns -1 as nentries. This changes prototype of extractQuery method to use int32* instead of uint32* for nentries argument. Based on that gincostestimate may see two corner cases: nothing will be found or seqscan should be used. Per proposal at http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php PS tsearch_core patch should be sightly modified to support changes, but I'm waiting a verdict about reviewing of tsearch_core patch.	2007-01-31 15:09:45 +00:00
Tom Lane	a053437d9e	Dept of second thoughts: the IQ of estimate_array_length() needs to be kept on par with that of scalararraysel(), else estimates that should track might not. Hence teach it about binary-compatible cases, too.	2007-01-28 02:53:34 +00:00
Tom Lane	af18f6ad85	Fix scalararraysel() to cope with binary-compatible cases, such as text[] versus varchar[]. This oversight probably explains Ryan Holmes' recent complaint --- he was getting a generic selectivity estimate instead of anything intelligent.	2007-01-28 01:37:38 +00:00
Tom Lane	4f06c688c7	Put back planner's ability to cache the results of mergejoinscansel(), which I had removed in the first cut of the EquivalenceClass rewrite to simplify that patch a little. But it's still important --- in a four-way join problem mergejoinscansel() was eating about 40% of the planning time according to gprof. Also, improve the EquivalenceClass code to re-use join RestrictInfos rather than generating fresh ones for each join considered. This saves some memory space but more importantly improves the effectiveness of caching planning info in RestrictInfos.	2007-01-22 20:00:40 +00:00
Tom Lane	f41803bb39	Refactor planner's pathkeys data structure to create a separate, explicit representation of equivalence classes of variables. This is an extensive rewrite, but it brings a number of benefits: * planner no longer fails in the presence of "incomplete" operator families that don't offer operators for every possible combination of datatypes. * avoid generating and then discarding redundant equality clauses. * remove bogus assumption that derived equalities always use operators named "=". * mergejoins can work with a variety of sort orders (e.g., descending) now, instead of tying each mergejoinable operator to exactly one sort order. * better recognition of redundant sort columns. * can make use of equalities appearing underneath an outer join.	2007-01-20 20:45:41 +00:00
Tom Lane	4431758229	Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST per-column options for btree indexes. The planner's support for this is still pretty rudimentary; it does not yet know how to plan mergejoins with nondefault ordering options. The documentation is pretty rudimentary, too. I'll work on improving that stuff later. Note incompatible change from prior behavior: ORDER BY ... USING will now be rejected if the operator is not a less-than or greater-than member of some btree opclass. This prevents less-than-sane behavior if an operator that doesn't actually define a proper sort ordering is selected.	2007-01-09 02:14:16 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Tom Lane	d6061d2f31	Fix regex_fixed_prefix() to cope reasonably well with regex patterns of the form '^(foo)$'. Before, these could never be optimized into indexscans. The recent changes to make psql and pg_dump generate such patterns (for \d commands and -t and related switches, respectively) therefore represented a big performance hit for people with large pg_class catalogs, as seen in recent gripe from Erik Jones. While at it, be more paranoid about case-sensitivity checking in multibyte encodings, and fix some other corner cases in which a regex might be interpreted too liberally.	2007-01-03 22:39:26 +00:00
Tom Lane	a78fcfb512	Restructure operator classes to allow improved handling of cross-data-type cases. Operator classes now exist within "operator families". While most families are equivalent to a single class, related classes can be grouped into one family to represent the fact that they are semantically compatible. Cross-type operators are now naturally adjunct parts of a family, without having to wedge them into a particular opclass as we had done originally. This commit restructures the catalogs and cleans up enough of the fallout so that everything still works at least as well as before, but most of the work needed to actually improve the planner's behavior will come later. Also, there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way to create a new family right now is to allow CREATE OPERATOR CLASS to make one by default. I owe some more documentation work, too. But that can all be done in smaller pieces once this infrastructure is in place.	2006-12-23 00:43:13 +00:00
Tom Lane	281f40187f	Fix some planner bugs exposed by reports from Arjen van der Meijden. These are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr (IN-clauses) or amortization of indexscan costs across repeated indexscans on the inside of a nestloop. In particular: Fix some logic errors in the estimation for multiple scans induced by a ScalarArrayOpExpr indexqual. Include a small cost component in bitmap index scans to reflect the costs of manipulating the bitmap itself; this is mainly to prevent a bitmap scan from appearing to have the same cost as a plain indexscan for fetching a single tuple. Also add a per-index-scan-startup CPU cost component; while prior releases were clearly too pessimistic about the cost of repeated indexscans, the original 8.2 coding allowed the cost of an indexscan to effectively go to zero if repeated often enough, which is overly optimistic. Pay some attention to index correlation when estimating costs for a nestloop inner indexscan: this is significant when the plan fetches multiple heap tuples per iteration, since high correlation means those tuples are probably on the same or adjacent heap pages.	2006-12-15 18:42:26 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Tom Lane	bfd1ffa948	Change patternsel (LIKE/regex selectivity estimation) so that if there is a large enough histogram, it will use the number of matches in the histogram to derive a selectivity estimate, rather than the admittedly pretty bogus heuristics involving examining the pattern contents. I set 'large enough' at 100, but perhaps we should change that later. Also apply the same technique in contrib/ltree's <@ and @> estimator. Per discussion with Stefan Kaltenbrunner and Matteo Beccati.	2006-09-20 19:50:21 +00:00
Tom Lane	b74c543685	Improve usage of effective_cache_size parameter by assuming that all the tables in the query compete for cache space, not just the one we are currently costing an indexscan for. This seems more realistic, and it definitely will help in examples recently exhibited by Stefan Kaltenbrunner. To get the total size of all the tables involved, we must tweak the handling of 'append relations' a bit --- formerly we looked up information about the child tables on-the-fly during set_append_rel_pathlist, but it needs to be done before we start doing any cost estimation, so push it into the add_base_rels_to_query scan.	2006-09-19 22:49:53 +00:00
Bruce Momjian	9a7483714f	Work around bug in strxfmt() but in MS VS2005. William ZHANG	2006-07-26 17:17:28 +00:00
Tom Lane	8dcaea7be0	Add a fudge factor to genericcostestimate() to prevent the planner from thinking that indexes of different sizes are equally attractive. Per gripe from Jim Nasby. (I remain unconvinced that there's such a problem in existing releases, but CVS HEAD definitely has got a problem because of its new count-only-leaf-pages approach to indexscan costing.)	2006-07-24 01:19:48 +00:00
Bruce Momjian	e0522505bd	Remove 576 references of include files that were not needed.	2006-07-14 14:52:27 +00:00
Tom Lane	08ccdf020e	Fix oversight in planning for multiple indexscans driven by ScalarArrayOpExpr index quals: we were estimating the right total number of rows returned, but treating the index-access part of the cost as if a single scan were fetching that many consecutive index tuples. Actually we should treat it as a multiple indexscan, and if there are enough of 'em the Mackert-Lohman discount should kick in.	2006-07-01 22:07:23 +00:00
Tom Lane	8a30cc2127	Make the planner estimate costs for nestloop inner indexscans on the basis that the Mackert-Lohmann formula applies across all the repetitions of the nestloop, not just each scan independently. We use the M-L formula to estimate the number of pages fetched from the index as well as from the table; that isn't what it was designed for, but it seems reasonably applicable anyway. This makes large numbers of repetitions look much cheaper than before, which accords with many reports we've received of overestimation of the cost of a nestloop. Also, change the index access cost model to charge random_page_cost per index leaf page touched, while explicitly not counting anything for access to metapage or upper tree pages. This may all need tweaking after we get some field experience, but in simple tests it seems to be giving saner results than before. The main thing is to get the infrastructure in place to let cost_index() and amcostestimate functions take repeated scans into account at all. Per my recent proposal. Note: this patch changes pg_proc.h, but I did not force initdb because the changes are basically cosmetic --- the system does not look into pg_proc to decide how to call an index amcostestimate function, and there's no way to call such a function from SQL at all.	2006-06-06 17:59:58 +00:00
Tom Lane	eed6c9ed7e	Add a GUC parameter seq_page_cost, and use that everywhere we formerly assumed that a sequential page fetch has cost 1.0. This patch doesn't in itself change the system's behavior at all, but it opens the door to people adopting other units of measurement for EXPLAIN costs. Also, if we ever decide it's worth inventing per-tablespace access cost settings, this change provides a workable intellectual framework for that.	2006-06-05 02:49:58 +00:00
Teodor Sigaev	8a3631f8d8	GIN: Generalized Inverted iNdex. text[], int4[], Tsearch2 support for GIN.	2006-05-02 11:28:56 +00:00
Tom Lane	427c6b5b98	Avoid assuming that statistics for a parent relation reflect the properties of the union of its child relations as well. This might have been a good idea when it was originally coded, but it's a fatally bad idea when inheritance is being used for partitioning. It's better to have no stats at all than completely misleading stats. Per report from Mark Liberman. The bug arguably exists all the way back, but I've only patched HEAD and 8.1 because we weren't particularly trying to support partitioning before 8.1. Eventually we ought to look at deriving union statistics instead of just punting, but for now the drop kick looks good.	2006-05-02 04:34:18 +00:00
Tom Lane	0f0a33099c	Generalize mcv_selectivity() to support both VAR OP CONST and CONST OP VAR cases. This was not needed in the existing uses within selfuncs.c, but if we're gonna export it for general use, the extra generality seems helpful. Motivated by looking at ltree example.	2006-04-27 17:52:40 +00:00
Tom Lane	a3c1a11fc1	If we're going to expose VariableStatData for contrib modules to use, then we should export a reasonable set of the supporting routines too.	2006-04-27 00:46:59 +00:00
Bruce Momjian	59d61409cd	Move ltree parentsel() selectivity function into /contrib/ltree.	2006-04-26 22:33:36 +00:00
Bruce Momjian	b3e4aefcfb	Enhanced containment selectivity function for /contrib/ltree Matteo Beccati	2006-04-26 18:28:34 +00:00
Tom Lane	efe222268f	Eliminate some no-longer-needed workarounds for palloc's old behavior of rejecting palloc(0). Also, tweak like_selectivity() to avoid assuming the presented pattern is nonempty; although that assumption is valid, it doesn't really help much, and the new coding is more correct anyway since it properly handles redundant wildcards. In combination these changes should eliminate a Coverity warning noted by Martijn.	2006-04-20 17:50:18 +00:00
Bruce Momjian	f2f5b05655	Update copyright for 2006. Update scripts.	2006-03-05 15:59:11 +00:00
Tom Lane	3a0a16cb7e	Allow row comparisons to be used as indexscan qualifications. This completes the project to upgrade our handling of row comparisons.	2006-01-25 20:29:24 +00:00
Tom Lane	34f8ee9737	Add selectivity-calculation code for RowCompareExpr nodes. Simplistic, but a lot better than nothing at all ...	2006-01-14 00:14:12 +00:00
Tom Lane	ce8fd39e15	Improve patternsel() by applying the operator itself to each value listed in the column's most-common-values statistics entry. This gives us an exact selectivity result for the portion of the column population represented by the MCV list, which can be a big leg up in accuracy if that's a large fraction of the population. The heuristics involving pattern contents and prefix are applied only to the part of the population not included in the MCV list.	2006-01-10 17:35:52 +00:00
Tom Lane	290166f934	Teach planner and executor to handle ScalarArrayOpExpr as an indexable qualification when the underlying operator is indexable and useOr is true. That is, indexkey op ANY (ARRAY[...]) is effectively translated into an OR combination of one indexscan for each array element. This only works for bitmap index scans, of course, since regular indexscans no longer support OR'ing of scans. There are still some loose ends to clean up before changing 'x IN (list)' to translate as a ScalarArrayOpExpr; for instance predtest.c ought to be taught about it. But this gets the basic functionality in place.	2005-11-25 19:47:50 +00:00
Bruce Momjian	436a2956d8	Re-run pgindent, fixing a problem where comment lines after a blank comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.	2005-11-22 18:17:34 +00:00
Tom Lane	2a8d3d83ef	R-tree is dead ... long live GiST.	2005-11-07 17:36:47 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	0cc0d0822d	Document that get_attstatsslot/free_attstatsslot only need to be passed valid type information if they are asked to fetch the values part of a pg_statistic slot; these arguments are unneeded if fetching only the numbers part. Use this to save a catcache lookup in btcostestimate, which is looking like a bit of a hotspot in recent profiling. Not a big savings, but since it's essentially free, might as well do it.	2005-10-11 17:27:14 +00:00
Tom Lane	303e089df5	Clean up possibly-uninitialized-variable warnings reported by gcc 4.x.	2005-09-24 22:54:44 +00:00
Tom Lane	8889685555	Suppress signed-vs-unsigned-char warnings.	2005-09-24 17:53:28 +00:00
Bruce Momjian	9dbd00b0e2	Remove unnecessary parentheses in assignments. Add spaces where needed. Reference time interval variables as tinterval.	2005-07-21 04:41:43 +00:00
Bruce Momjian	a536b2dd80	Add time/date macros for code clarity: #define DAYS_PER_YEAR 365.25 #define MONTHS_PER_YEAR 12 #define DAYS_PER_MONTH 30 #define HOURS_PER_DAY 24	2005-07-21 03:56:25 +00:00
Bruce Momjian	db05f4a7eb	Add 'day' field to INTERVAL so 1 day interval can be distinguished from 24 hours. This is very helpful for daylight savings time: select '2005-05-03 00:00:00 EST'::timestamp with time zone + '24 hours'; ?column? ---------------------- 2005-05-04 01:00:00-04 select '2005-05-03 00:00:00 EST'::timestamp with time zone + '1 day'; ?column? ---------------------- 2005-05-04 01:00:00-04 Michael Glaesemann	2005-07-20 16:42:32 +00:00
Bruce Momjian	7f0b690334	Improve comments for AdjustIntervalForTypmod. Blank line adjustments.	2005-07-12 16:05:12 +00:00
Tom Lane	b5f7cff84f	Clean up the rather historically encumbered interface to now() and current time: provide a GetCurrentTimestamp() function that returns current time in the form of a TimestampTz, instead of separate time_t and microseconds fields. This is what all the callers really want anyway, and it eliminates low-level dependencies on AbsoluteTime, which is a deprecated datatype that will have to disappear eventually.	2005-06-29 22:51:57 +00:00
Tom Lane	c186c93148	Change the planner to allow indexscan qualification clauses to use nonconsecutive columns of a multicolumn index, as per discussion around mid-May (pghackers thread "Best way to scan on-disk bitmaps"). This turns out to require only minimal changes in btree, and so far as I can see none at all in GiST. btcostestimate did need some work, but its original assumption that index selectivity == heap selectivity was quite bogus even before this.	2005-06-13 23:14:49 +00:00
Tom Lane	2f1210629c	Separate predicate-testing code out of indxpath.c, making it a module in its own right. As proposed by Simon Riggs, but with some editorializing of my own.	2005-06-10 22:25:37 +00:00
Tom Lane	9ab4d98168	Remove planner's private fields from Query struct, and put them into a new PlannerInfo struct, which is passed around instead of the bare Query in all the planning code. This commit is essentially just a code-beautification exercise, but it does open the door to making larger changes to the planner data structures without having to muck with the widely-known Query struct.	2005-06-05 22:32:58 +00:00
Tom Lane	1e85fa2008	patternsel() was improperly stripping RelabelType from the derived expressions it constructed, causing scalarineqsel to become confused if the underlying variable was of a domain type. Per report from Kevin Grittner.	2005-06-01 17:05:11 +00:00
Tom Lane	5b05185262	Remove support for OR'd indexscans internal to a single IndexScan plan node, as this behavior is now better done as a bitmap OR indexscan. This allows considerable simplification in nodeIndexscan.c itself as well as several planner modules concerned with indexscan plan generation. Also we can improve the sharing of code between regular and bitmap indexscans, since they are now working with nigh-identical Plan nodes.	2005-04-25 01:30:14 +00:00
Tom Lane	162bd08b3f	Completion of project to use fixed OIDs for all system catalogs and indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...	2005-04-14 20:03:27 +00:00
Tom Lane	a5dda5dc3a	Second try at making examine_variable and friends behave sanely in cases with binary-compatible relabeling. My first try was implicitly assuming that all operators scalarineqsel is used for have binary- compatible datatypes on both sides ... which is very wrong of course. Per report from Michael Fuhr.	2005-04-01 20:31:50 +00:00
Tom Lane	bf3dbb5881	First steps towards index scans with heap access decoupled from index access: define new index access method functions 'amgetmulti' that can fetch multiple TIDs per call. (The functions exist but are totally untested as yet.) Since I was modifying pg_am anyway, remove the no-longer-needed 'rel' parameter from amcostestimate functions, and also remove the vestigial amowner column that was creating useless work for Alvaro's shared-object-dependencies project. Initdb forced due to changes in pg_am.	2005-03-27 23:53:05 +00:00
Tom Lane	9d388e1f39	Fix a pair of related issues with estimation of inequalities that involve binary-compatible relabeling of one or both operands. examine_variable should avoid stripping RelabelType from non-variable expressions, so that they will continue to have the correct type; and convert_to_scalar should just use that type and ignore the other input type. This isn't perfect but it beats failing entirely. Per example from Michael Fuhr.	2005-03-26 20:55:39 +00:00
Bruce Momjian	e3d7de6b99	Rename canonical encodings, per Peter: UNICODE => UTF8 ALT => WIN866 WIN => WIN1251 TCVN => WIN1258 The old codes continue to work.	2005-03-07 04:30:55 +00:00
Tom Lane	849074f9ae	Revise hash join code so that we can increase the number of batches on-the-fly, and thereby avoid blowing out memory when the planner has underestimated the hash table size. Hash join will now obey the work_mem limit with some faithfulness. Per my recent proposal (hash aggregate part isn't done yet though).	2005-03-06 22:15:05 +00:00
Tom Lane	a3f945a1b2	Adjust estimate_num_groups() to not clamp per-relation group count estimate to less than the number of values estimated for any one grouping Var, as suggested by Manfred. This is intuitively right, and what's more it puts the plan choices in the subselect regression test back the way they were before ...	2005-02-01 23:08:13 +00:00
Tom Lane	875b0c62fa	When dealing with multiple grouping columns coming from the same table, clamp the estimated number of groups to table row count over 10, instead of table row count; this reflects a heuristic that people probably won't group over a near-unique set of columns, and the knowledge that we don't currently have any way to estimate the correlation of the columns better than guessing. This change creates a trivial plan change in one of the regression tests.	2005-01-28 20:34:27 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Tom Lane	76e8a87f15	Teach regex_fixed_prefix() the correct handling of advanced regex escapes --- they aren't simply quoted characters. Problem noted by Antti Salmela. Also fix problem with incorrect handling of multibyte characters when followed by a quantifier.	2004-12-02 02:45:07 +00:00
Tom Lane	547bb4a7f2	Use a hopefully-more-reliable method of detecting default selectivity estimates when combining the estimates for a range query. As pointed out by Miquel van Smoorenburg, the existing check for an impossible combined result would quite possibly fail to detect one default and one non-default input. It seems better to use the default range query estimate in such cases. To do so, add a check for an estimate of exactly DEFAULT_INEQ_SEL. This is a bit ugly because it introduces additional coupling between clauselist_selectivity and scalarltsel/scalargtsel, but it's not like there wasn't plenty already...	2004-11-09 00:34:46 +00:00
Tom Lane	84c7cef5eb	Fix estimate_num_groups to be able to use expression-index statistics when there is an expressional index matching a GROUP BY item.	2004-09-18 19:39:50 +00:00
Bruce Momjian	15d3f9f6b7	Another pgindent run with lib typedefs added.	2004-08-30 02:54:42 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Tom Lane	fcbc438727	Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments and documentation to reference 8.0 instead of 7.5.	2004-08-04 21:34:35 +00:00
Tom Lane	7643bed58e	When using extended-query protocol, postpone planning of unnamed statements until Bind is received, so that actual parameter values are visible to the planner. Make use of the parameter values for estimation purposes (but don't fold them into the actual plan). This buys back most of the potential loss of plan quality that ensues from using out-of-line parameters instead of putting literal values right into the query text. This patch creates a notion of constant-folding expressions 'for estimation purposes only', in which case we can be more aggressive than the normal eval_const_expressions() logic can be. Right now the only difference in behavior is inserting bound values for Params, but it will be interesting to look at other possibilities. One that we've seen come up repeatedly is reducing now() and related functions to current values, so that queries like ... WHERE timestampcol > now() - '1 day' have some chance of being planned effectively. Oliver Jowett, with some kibitzing from Tom Lane.	2004-06-11 01:09:22 +00:00
Neil Conway	72b6ad6313	Use the new List API function names throughout the backend, and disable the list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.	2004-05-30 23:40:41 +00:00
Neil Conway	d0b4399d81	Reimplement the linked list data structure used throughout the backend. In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.	2004-05-26 04:41:50 +00:00
Tom Lane	df79b847fe	genericcostestimate() neglected to include qual startup cost in indexTotalCost. I think this may not make any real difference in 7.4, but it definitely is a problem with CVS tip's new equation.	2004-02-27 21:44:34 +00:00
Tom Lane	a536ed53bc	Make use of statistics on index expressions. There are still some corner cases that could stand improvement, but it does all the basic stuff. A byproduct is that the selectivity routines are no longer constrained to working on simple Vars; we might in future be able to improve the behavior for subexpressions that don't match indexes.	2004-02-17 00:52:53 +00:00
Tom Lane	9fe097577e	Avoid generating invalid character encoding sequences in make_greater_string. Not sure how this mistake evaded detection for so long.	2004-02-02 03:07:08 +00:00
Tom Lane	de816a03c4	Repair misestimation of indexscan CPU costs. When an indexqual contains a run-time key (that is, a nonconstant expression compared to the index variable), the key is evaluated just once per scan, but we were charging costs as though it were evaluated once per visited index entry.	2004-01-17 20:09:35 +00:00
Neil Conway	192ad63bd7	More janitorial work: remove the explicit casting of NULL literals to a pointer type when it is not necessary to do so. For future reference, casting NULL to a pointer type is only necessary when (a) invoking a function AND either (b) the function has no prototype OR (c) the function is a varargs function.	2004-01-07 18:56:30 +00:00
Tom Lane	fa559a86ee	Adjust indexscan planning logic to keep RestrictInfo nodes associated with index qual clauses in the Path representation. This saves a little work during createplan and (probably more importantly) allows reuse of cached selectivity estimates during indexscan planning. Also fix latent bug: wrong plan would have been generated for a 'special operator' used in a nestloop-inner-indexscan join qual, because the special operator would not have gotten into the list of quals to recheck. This bug is only latent because at present the special-operator code could never trigger on a join qual, but sooner or later someone will want to do it.	2004-01-05 23:39:54 +00:00
Tom Lane	5c5b911fcc	Using canonicalize_qual() to get rid of duplicate index predicate conditions is overkill; set_union() does the job about as well, and much more efficiently. Furthermore this avoids assuming that canonicalize_qual() will check for duplicate clauses at all, which it may not always do.	2003-12-29 22:22:45 +00:00
Tom Lane	c607bd693f	Clean up the usage of canonicalize_qual(): in particular, be consistent about whether it is applied before or after eval_const_expressions(). I believe there were some corner cases where the system would fail to recognize that a partial index is applicable because of the previous inconsistency. Store normal rather than 'implicit AND' representations of constraints and index predicates in the catalogs. initdb forced due to representation change of constraints/predicates.	2003-12-28 21:57:37 +00:00
Joe Conway	53e7c1363a	Repair indexed bytea like operations, and related selectivity functionality. Per bug report by Alvar Freude: http://archives.postgresql.org/pgsql-bugs/2003-12/msg00022.php	2003-12-07 04:14:10 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Tom Lane	fa5c8a055a	Cross-data-type comparisons are now indexable by btrees, pursuant to my pghackers proposal of 8-Nov. All the existing cross-type comparison operators (int2/int4/int8 and float4/float8) have appropriate support. The original proposal of storing the right-hand-side datatype as part of the primary key for pg_amop and pg_amproc got modified a bit in the event; it is easier to store zero as the 'default' case and only store a nonzero when the operator is actually cross-type. Along the way, remove the long-since-defunct bigbox_ops operator class.	2003-11-12 21:15:59 +00:00
Tom Lane	64c1fc7257	Avoid division by zero in estimate_num_groups() when table has no rows.	2003-10-16 21:37:54 +00:00
Peter Eisentraut	feb4f44d29	Message editing: remove gratuitous variations in message wording, standardize terms, add some clarifications, fix some untranslatable attempts at dynamic message building.	2003-09-25 06:58:07 +00:00
Bruce Momjian	46785776c4	Another pgindent run with updated typedefs.	2003-08-08 21:42:59 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	b6a1d25b0a	Error message editing in utils/adt. Again thanks to Joe Conway for doing the bulk of the heavy lifting ...	2003-07-27 04:53:12 +00:00
Tom Lane	0347d310d7	Oh, for crying in a bucket ... relax Assert so that glibc's strxfrm does not dump core.	2003-07-17 22:20:14 +00:00
Tom Lane	59d9a37080	Work around buggy strxfrm() present in some Solaris releases.	2003-07-17 20:52:36 +00:00
Tom Lane	fc8d970cbc	Replace functional-index facility with expressional indexes. Any column of an index can now be a computed expression instead of a simple variable. Restrictions on expressions are the same as for predicates (only immutable functions, no sub-selects). This fixes problems recently introduced with inlining SQL functions, because the inlining transformation is applied to both expression trees so the planner can still match them up. Along the way, improve efficiency of handling index predicates (both predicates and index expressions are now cached by the relcache) and fix 7.3 oversight that didn't record dependencies of predicate expressions.	2003-05-28 16:04:02 +00:00
Tom Lane	f45df8c014	Cause CHAR(n) to TEXT or VARCHAR conversion to automatically strip trailing blanks, in hopes of reducing the surprise factor for newbies. Remove redundant operators for VARCHAR (it depends wholly on TEXT operations now). Clean up resolution of ambiguous operators/functions to avoid surprising choices for domains: domains are treated as equivalent to their base types and binary-coercibility is no longer considered a preference item when choosing among multiple operators/functions. IsBinaryCoercible now correctly reflects the notion that you need only relabel the type to get from type A to type B: that is, a domain is binary-coercible to its base type, but not vice versa. Various marginal cleanup, including merging the essentially duplicate resolution code in parse_func.c and parse_oper.c. Improve opr_sanity regression test to understand about binary compatibility (using pg_cast), and fix a couple of small errors in the catalogs revealed thereby. Restructure "special operator" handling to fetch operators via index opclasses rather than hardwiring assumptions about names (cleans up the pattern_ops stuff a little).	2003-05-26 00:11:29 +00:00
Peter Eisentraut	2c0556068f	Indexing support for pattern matching operations via separate operator class when lc_collate is not C.	2003-05-15 15:50:21 +00:00

1 2 3 4 5 ...

286 Commits