postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	6980f817e8	Stop btree indexscans upon reaching nulls in either direction. The existing scan-direction-sensitive tests were overly complex, and failed to stop the scan in cases where it's perfectly legitimate to do so. Per bug #6278 from Maksym Boguk. Back-patch to 8.3, which is as far back as the patch applies easily. Doesn't seem worth sweating over a relatively minor performance issue in 8.2 at this late date. (But note that this was a performance regression from 8.1 and before, so 8.2 is being left as an outlier.)	2011-10-31 16:40:04 -04:00
Tom Lane	6743a878a4	Support more locale-specific formatting options in cash_out(). The POSIX spec defines locale fields for controlling the ordering of the value, sign, and currency symbol in monetary output, but cash_out only supported a small subset of these options. Fully implement p/n_sign_posn, p/n_cs_precedes, and p/n_sep_by_space per spec. Fix up cash_in so that it will accept all these format variants. Also, make sure that thousands_sep is only inserted to the left of the decimal point, as required by spec. Per bug #6144 from Eduard Kracmar and discussion of bug #6277. This patch includes some ideas from Alexander Lakhin's proposed patch, though it is very different in detail.	2011-10-30 15:02:58 -04:00
Tom Lane	eb5834d5af	Further improvement of make_greater_string. Make sure that it considers all the possibilities that the old code did, instead of trying only one possibility per character position. To keep the runtime in bounds, instead tweak the character incrementers to not try every possible multibyte character code. Remove unnecessary logic to restore the old character value on failure. Additional comment and formatting cleanup.	2011-10-30 12:22:11 -04:00
Robert Haas	fae54e4a16	Update visibilitymap.c header comments. Recent work on index-only scans left this somewhat out of date.	2011-10-29 14:46:59 -04:00
Tom Lane	7609239f3e	Fix assorted bogosities in cash_in() and cash_out(). cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.	2011-10-29 14:32:06 -04:00
Robert Haas	78d523b633	Improve make_greater_string() with encoding-specific incrementers. This infrastructure doesn't in any way guarantee that the character we produce will sort before the one we incremented; but it does at least make it much more likely that we'll end up with something that is a valid character, which improves our chances. Kyotaro Horiguchi, with various adjustments by me.	2011-10-29 14:22:20 -04:00
Robert Haas	53f1ca59b5	Allow hint bits to be set sooner for temporary and unlogged tables. We need not wait until the commit record is durably on disk, because in the event of a crash the page we're updating with hint bits will be gone anyway. Per off-list report from Heikki Linnakangas, this can significantly degrade the performance of unlogged tables; I was able to show a 2x speedup from this patch on a pgbench run with scale factor 15. In practice, this will mostly help small, heavily updated tables, because on larger tables you're unlikely to run into the same row again before the commit record makes it out to disk.	2011-10-28 17:08:09 -04:00
Heikki Linnakangas	cbf65509bb	Fix the number of lwlocks needed by the "fast path" lock patch. It needs one lock per backend or auxiliary process - the need for a lock for each aux process was not accounted for in NumLWLocks(). No-one noticed, because the three locks needed for the three aux processes fit into the few extra lwlocks we allocate for 3rd party modules that don't call RequestAddinLWLocks() (NUM_USER_DEFINED_LWLOCKS, 4 by default).	2011-10-27 22:39:58 +03:00
Tom Lane	3e4b3465b6	Improve planner's ability to recognize cases where an IN's RHS is unique. If the right-hand side of a semijoin is unique, then we can treat it like a normal join (or another way to say that is: we don't need to explicitly unique-ify the data before doing it as a normal join). We were recognizing such cases when the RHS was a sub-query with appropriate DISTINCT or GROUP BY decoration, but there's another way: if the RHS is a plain relation with unique indexes, we can check if any of the indexes prove the output is unique. Most of the infrastructure for that was there already in the join removal code, though I had to rearrange it a bit. Per reflection about a recent example in pgsql-performance.	2011-10-26 17:52:29 -04:00
Tom Lane	1e3b21dd5e	Change FK trigger naming convention to fix self-referential FKs. Use names like "RI_ConstraintTrigger_a_NNNN" for FK action triggers and "RI_ConstraintTrigger_c_NNNN" for FK check triggers. This ensures the action trigger fires first in self-referential cases where the very same row update fires both an action and a check trigger. This change provides a non-probabilistic solution for bug #6268, at the risk that it could break client code that is making assumptions about the exact names assigned to auto-generated FK triggers. Hence, change this in HEAD only. No need for forced initdb since old triggers continue to work fine.	2011-10-26 13:19:42 -04:00
Tom Lane	58958726ff	Change FK trigger creation order to better support self-referential FKs. When a foreign-key constraint references another column of the same table, row updates will queue both the PK's ON UPDATE action and the FK's CHECK action in the same event. The ON UPDATE action must execute first, else the CHECK will check a non-final state of the row and possibly throw an inappropriate error, as seen in bug #6268 from Roman Lytovchenko. Now, the firing order of multiple triggers for the same event is determined by the sort order of their pg_trigger.tgnames, and the auto-generated names we use for FK triggers are "RI_ConstraintTrigger_NNNN" where NNNN is the trigger OID. So most of the time the firing order is the same as creation order, and so rearranging the creation order fixes it. This patch will fail to fix the problem if the OID counter wraps around or adds a decimal digit (eg, from 99999 to 100000) while we are creating the triggers for an FK constraint. Given the small odds of that, and the low usage of self-referential FKs, we'll live with that solution in the back branches. A better fix is to change the auto-generated names for FK triggers, but it seems unwise to do that in stable branches because there may be client code that depends on the naming convention. We'll fix it that way in HEAD in a separate patch. Back-patch to all supported branches, since this bug has existed for a long time.	2011-10-26 13:02:28 -04:00
Magnus Hagander	a87b9ae161	Make event_source visible on all platforms. On non-Windows platforms, we just ignore any value set there. Noted by Jaime Casanova.	2011-10-25 22:40:58 +02:00
Magnus Hagander	d8ea33f2c0	Support configurable eventlog application names on Windows. This allows different instances to use the eventlog with different identifiers, by setting the event_source GUC, similar to how syslog_ident works. Original patch by MauMau, heavily modified by Magnus Hagander.	2011-10-25 20:02:55 +02:00
Tom Lane	0f39d5050d	Don't trust deferred-unique indexes for join removal. The uniqueness condition might fail to hold intra-transaction, and assuming it does can give incorrect query results. Per report from Marti Raudsepp, though this is not his proposed patch. Back-patch to 9.0, where both these features were introduced. In the released branches, add the new IndexOptInfo field to the end of the struct, to try to minimize ABI breakage for third-party code that may be examining that struct.	2011-10-23 00:43:39 -04:00
Tom Lane	bb446b689b	Support synchronization of snapshots through an export/import procedure. A transaction can export a snapshot with pg_export_snapshot(), and then others can import it with SET TRANSACTION SNAPSHOT. The data does not leave the server so there are no security issues. A snapshot can only be imported while the exporting transaction is still running, and there are some other restrictions. I'm not totally convinced that we've covered all the bases for SSI (true serializable) mode, but it works fine for lesser isolation modes. Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified by Tom Lane	2011-10-22 18:23:30 -04:00
Heikki Linnakangas	b436c72f61	Fix overly-complicated usage of errcode_for_file_access(). No need to do "errcode(errcode_for_file_access())", just "errcode_for_file_access()" is enough. The extra errcode() call is useless but harmless, so there's no user-visible bug here. Nevertheless, backpatch to 9.1 where this code was added.	2011-10-22 20:19:50 +03:00
Tom Lane	f9c92a5a3e	Code review for pgstat_get_crashed_backend_activity patch. Avoid possibly dumping core when pgstat_track_activity_query_size has a less-than-default value; avoid uselessly searching for the query string of a successfully-exited backend; don't bother putting out an ERRDETAIL if we don't have a query to show; some other minor stylistic improvements.	2011-10-21 16:36:04 -04:00
Tom Lane	5ac5980744	More cleanup after failed reduced-lock-levels-for-DDL feature. Turns out that use of ShareUpdateExclusiveLock or ShareRowExclusiveLock to protect DDL changes had gotten copied into several places that were not touched by either of Simon's original patches for the feature, and thus neither he nor I thought to revert them. (Indeed, it appears that two of these uses were committed after the reversion, which just goes to show that git merging is no panacea.) Change these places to use AccessExclusiveLock again. If we ever manage to resurrect that feature, we're going to have to think a bit harder about how to keep lock level usage in sync for DDL operations that aren't within the AlterTable infrastructure. Two of these bugs are only in HEAD, but one is in the 9.1 branch too. Alvaro found one of them, I found the other two.	2011-10-21 13:50:30 -04:00
Robert Haas	c8e8b5a6e2	Try to log the current query string when a backend crashes. To minimize risk inside the postmaster, we subject this feature to a number of significant limitations. We very much wish to avoid doing any complex processing inside the postmaster, due to the possibility that the crashed backend has completely corrupted shared memory. To that end, no encoding conversion is done; instead, we just replace anything that doesn't look like an ASCII character with a question mark. We limit the amount of data copied to 1024 characters, and carefully sanity check the source of that data. While these restrictions would doubtless be unacceptable in a general-purpose logging facility, even this limited facility seems like an improvement over the status quo ante. Marti Raudsepp, reviewed by PDXPUG and myself	2011-10-21 13:26:40 -04:00
Robert Haas	980261929f	Fix DROP OPERATOR FAMILY IF EXISTS. Essentially, the "IF EXISTS" portion was being ignored, and an error thrown anyway if the opfamily did not exist. I broke this in commit fd1843ff8979c0461fb3f1a9eab61140c977e32d; so backpatch to 9.1.X. Report and diagnosis by KaiGai Kohei.	2011-10-21 09:12:23 -04:00
Tom Lane	b4a0223d00	Simplify and improve ProcessStandbyHSFeedbackMessage logic. There's no need to clamp the standby's xmin to be greater than GetOldestXmin's result; if there were any such need this logic would be hopelessly inadequate anyway, because it fails to account for within-database versus cluster-wide values of GetOldestXmin. So get rid of that, and just rely on sanity-checking that the xmin is not wrapped around relative to the nextXid counter. Also, don't reset the walsender's xmin if the current feedback xmin is indeed out of range; that just creates more problems than we already had. Lastly, don't bother to take the ProcArrayLock; there's no need to do that to set xmin. Also improve the comments about this in GetOldestXmin itself.	2011-10-20 19:43:31 -04:00
Robert Haas	8f3362d4b7	Fix get_object_namespace() not to think extensions are "in" a schema. extnamespace means something altogether different in this context. Mostly by accident, this coding error (introduced in my commit `82a4a777d9`) broke the buildfarm instead of just silently doing the wrong thing.	2011-10-20 00:07:41 -04:00
Robert Haas	1d751018d8	Add "skipping" to the NOTICE produced by DROP OPERATOR CLASS IF EXISTS. This makes this message consistent with all the other similar notices produced by other DROP IF EXISTS commands. Noted by KaiGai Kohei	2011-10-19 23:45:31 -04:00
Robert Haas	82a4a777d9	Consolidate DROP handling for some object types. This gets rid of a significant amount of duplicative code. KaiGai Kohei, reviewed in earlier versions by Dimitri Fontaine, with further review and cleanup by me.	2011-10-19 23:27:19 -04:00
Tom Lane	aa90e148ca	Suppress -Wunused-result warnings about write() and fwrite(). This is merely an exercise in satisfying pedants, not a bug fix, because in every case we were checking for failure later with ferror(), or else there was nothing useful to be done about a failure anyway. Document the latter cases.	2011-10-18 21:37:51 -04:00
Tom Lane	e27f52f3a1	Reject empty pg_hba.conf files. An empty HBA file is surely an error, since it means there is no way to connect to the server. We've not heard identifiable reports of people actually doing that, but this will also close off the case Thom Brown just complained of, namely pointing hba_file at a directory. (On at least some platforms with some directories, it will read as an empty file.) Perhaps this should be back-patched, but given the lack of previous complaints, I won't add extra work for the translators.	2011-10-18 20:09:18 -04:00
Magnus Hagander	d1e25b78f9	Exclude postmaster.opts from base backups. Noted by Fujii Masao.	2011-10-18 15:58:37 +02:00
Tom Lane	336c1d7a51	Avoid assuming that index-only scan data matches the index's rowtype. In general the data returned by an index-only scan should have the datatypes originally computed by FormIndexDatum. If the index opclasses use "storage" datatypes different from their input datatypes, the scan tuple will not have the same rowtype attributed to the index; but we had a hard-wired assumption that that was true in nodeIndexonlyscan.c. We'd already hacked around the issue for the one case where the types are different in btree indexes (btree name_ops), but this would definitely come back to bite us if we ever implement index-only scans in GiST. To fix, require the index AM to explicitly provide the tupdesc for the tuple it is returning. btree can just pass back the index's tupdesc, but GiST will have to work harder when and if it supports index-only scans. I had previously proposed fixing this by allowing the index AM to fill the scan tuple slot directly; but on reflection that seemed like a module layering violation, since TupleTableSlots are creatures of the executor. At least in the btree case, it would also be less efficient, since the tuple deconstruction work would occur even for rows later found to be invisible to the scan's snapshot.	2011-10-16 19:15:04 -04:00
Tom Lane	9e8da0f757	Teach btree to handle ScalarArrayOpExpr quals natively. This allows "indexedcol op ANY(ARRAY[...])" conditions to be used in plain indexscans, and particularly in index-only scans.	2011-10-16 15:39:24 -04:00
Tom Lane	d26e1ebaf5	Fix bugs in information_schema.referential_constraints view. This view was being insufficiently careful about matching the FK constraint to the depended-on primary or unique key constraint. That could result in failure to show an FK constraint at all, or showing it multiple times, or claiming that it depended on a different constraint than the one it really does. Fix by joining via pg_depend to ensure that we find only the correct dependency. Back-patch, but don't bump catversion because we can't force initdb in back branches. The next minor-version release notes should explain that if you need to fix this in an existing installation, you can drop the information_schema schema then re-create it by sourcing $SHAREDIR/information_schema.sql in each database (as a superuser of course).	2011-10-14 20:24:17 -04:00
Tom Lane	e6858e6657	Measure the number of all-visible pages for use in index-only scan costing. Add a column pg_class.relallvisible to remember the number of pages that were all-visible according to the visibility map as of the last VACUUM (or ANALYZE, or some other operations that update pg_class.relpages). Use relallvisible/relpages, instead of an arbitrary constant, to estimate how many heap page fetches can be avoided during an index-only scan. This is pretty primitive and will no doubt see refinements once we've acquired more field experience with the index-only scan mechanism, but it's way better than using a constant. Note: I had to adjust an underspecified query in the window.sql regression test, because it was changing answers when the plan changed to use an index-only scan. Some of the adjacent tests perhaps should be adjusted as well, but I didn't do that here.	2011-10-14 17:23:46 -04:00
Robert Haas	393e828e31	Avoid potential relcache leak in objectaddress.c. Nobody using the missing_ok flag yet, but let's speculate that this will be a better interface for future callers. KaiGai Kohei, with some adjustments by me.	2011-10-14 11:35:40 -04:00
Bruce Momjian	0180bd6180	Remove all "traces" of trace_userlocks, because userlocks were removed in PG 8.2.	2011-10-13 19:59:57 -04:00
Tom Lane	7b96519fe2	Don't mark auto-generated types as extension members. Relation rowtypes and automatically-generated array types do not need to have their own extension membership dependency entries. If we create such then it becomes more difficult to remove items from an extension, and it's also harder for an extension upgrade script to make sure it duplicates the dependencies created by the extension's regular installation script. I changed the code in such a way that this happened in commit `988cccc620`, I think because of worries about the shell-type-replacement case; but that cure was worse than the disease. It would only matter if one extension created a shell type that was replaced with an auto-generated type in another extension, which seems pretty far-fetched. Better to make this work unsurprisingly in normal cases. Report and patch by Robert Haas, comment adjustments by me.	2011-10-12 18:41:49 -04:00
Bruce Momjian	484af9b376	Modify RelationGetBufferForTuple() to use a typedef, rather than a struct, to help pgindent.	2011-10-12 16:53:54 -04:00
Tom Lane	458857cc9d	Throw a useful error message if an extension script file is fed to psql. We have seen one too many reports of people trying to use 9.1 extension files in the old-fashioned way of sourcing them in psql. Not only does that usually not work (due to failure to substitute for MODULE_PATHNAME and/or @extschema@), but if it did work they'd get a collection of loose objects not an extension. To prevent this, insert an \echo ... \quit line that prints a suitable error message into each extension script file, and teach commands/extension.c to ignore lines starting with \echo. That should not only prevent any adverse consequences of loading a script file the wrong way, but make it crystal clear to users that they need to do it differently now. Tom Lane, following an idea of Andrew Dunstan's. Back-patch into 9.1 ... there is not going to be much value in this if we wait till 9.2.	2011-10-12 15:45:03 -04:00
Tom Lane	8c8ba6d11b	Add comment on why pulling data from a "name" index column can't crash. It's been bothering me for several days that pretending that the cstring data stored in a btree name_ops column is really a "name" Datum could lead to reading past the end of memory. However, given the current memory layout used for index-only scans in the btree code, a crash is in fact not possible. Document that so we don't break it. I have not thought of any other solutions that aren't fairly ugly too, and most of them lose the functionality of index-only scans on name columns altogether, so this seems like the way to go.	2011-10-11 18:40:53 -04:00
Tom Lane	cb6771fb32	Generate index-only scan tuple descriptor from the plan node's indextlist. Dept. of second thoughts: as long as we've got that tlist hanging around anyway, we can apply ExecTypeFromTL to it to get a suitable descriptor for the ScanTupleSlot. This is a nicer solution than the previous one because it eliminates some hard-wired knowledge about btree name_ops, and because it avoids the somewhat shaky assumption that we needn't set up the scan tuple descriptor in EXPLAIN_ONLY mode. It doesn't change what actually happens at run-time though, and I'm still a bit nervous about that.	2011-10-11 18:12:57 -04:00
Tom Lane	600d3206d1	Consider index-only scans even when there is no matching qual or ORDER BY. By popular demand.	2011-10-11 15:00:30 -04:00
Tom Lane	a0185461dd	Rearrange the implementation of index-only scans. This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).	2011-10-11 14:21:30 -04:00
Robert Haas	fa351d5a0d	Replace hardcoded switch in object_exists() with a lookup table. There's no particular advantage to this change on its face; indeed, it's possible that this might be slightly slower than the old way. But it makes this information more easily accessible to other functions, and therefore paves the way for future code consolidation. Performance isn't critical here, so there's no need to be smart about how we do the search. This is a heavily cut-down version of a patch from KaiGai Kohei, with several fixes by me. Additional review from Dimitri Fontaine.	2011-10-11 09:14:30 -04:00
Robert Haas	e76bcaba9c	Repair breakage in VirtualXactLock. I broke this in commit `84e3712677`. Report and fix by Fujii Masao.	2011-10-11 07:39:09 -04:00
Bruce Momjian	e26d5fcd94	Mark GUC external_pid_file's default as '' in postgresql.conf, rather than '(none)'.	2011-10-10 08:17:10 -04:00
Robert Haas	c0f03aae04	Fix ALTER TABLE ONLY .. DROP CONSTRAINT. When I consolidated two copies of the HOT-chain search logic in commit `4da99ea423`, I introduced a behavior change: the old code wouldn't necessarily traverse the entire chain, if the most recently returned tuple were updated while the HOT chain traversal is in progress. The new behavior seems more correct, but unfortunately, the code here relies on a scan with SnapshotNow failing to see its own updates. That seems pretty shaky even with the old HOT chain traversal behavior, since there's no guarantee that these updates will always be HOT, but it's trivial to provoke a failure with the new HOT search logic. Fix by updating just the first matching pg_constraint tuple, rather than all of them, since there should be only one anyway. But since nobody has reproduced this failure on older versions, no back-patch for now. Report and test case by Alex Hunsaker; tablecmds.c changes by me.	2011-10-09 23:39:52 -04:00
Heikki Linnakangas	d50e125194	Clean up a couple of box gist helper functions. The original idea of this patch was to make box picksplit run faster, by eliminating unnecessary palloc() overhead, but that was obsoleted by the new double-sorting split algorithm that doesn't call these functions so heavily anymore. Nevertheless, the code looks better this way. Original patch by me, reviewed and tidied up after the double-sorting patch by Kevin Grittner.	2011-10-09 18:59:34 +03:00
Tom Lane	cbfa92c23c	Improve index-only scans to avoid repeated access to the index page. We copy all the matched tuples off the page during _bt_readpage, instead of expensively re-locking the page during each subsequent tuple fetch. This costs a bit more local storage, but not more than 2*BLCKSZ worth, and the reduction in LWLock traffic is certainly worth that. What's more, this lets us get rid of the API wart in the original patch that said an index AM could randomly decline to supply an index tuple despite having asserted pg_am.amcanreturn. That will be important for future improvements in the index-only-scan feature, since the executor will now be able to rely on having the index data available.	2011-10-09 00:21:08 -04:00
Tom Lane	b324384f6b	Fix brain fade in cost estimation for index-only scans. visibility_fraction should not be applied to regular indexscans. Noted by Cédric Villemain.	2011-10-08 10:41:17 -04:00
Heikki Linnakangas	1ef60dab70	Don't let transform_null_equals=on affect CASE foo WHEN NULL ... constructs. transform_null_equals is only supposed to affect "foo = NULL" expressions given directly by the user, not the internal "foo = NULL" expression generated from CASE-WHEN. This fixes bug #6242, reported by Sergey. Backpatch to all supported branches.	2011-10-08 11:17:40 +03:00
Tom Lane	a2822fb933	Support index-only scans using the visibility map to avoid heap fetches. When a btree index contains all columns required by the query, and the visibility map shows that all tuples on a target heap page are visible-to-all, we don't need to fetch that heap page. This patch depends on the previous patches that made the visibility map reliable. There's a fair amount left to do here, notably trying to figure out a less chintzy way of estimating the cost of an index-only scan, but the core functionality seems ready to commit. Robert Haas and Ibrar Ahmed, with some previous work by Heikki Linnakangas.	2011-10-07 20:14:13 -04:00
Magnus Hagander	7aeff9f4a4	Ensure walsenders can be SIGTERMed while in non-walsender code. In order to exit on SIGTERM when in non-walsender code, such as do_pg_stop_backup(), we need to set the interrupt variables that are used there, and not just the walsender local ones.	2011-10-06 21:43:14 +02:00
Bruce Momjian	aaa6e1def2	Add postmaster -C option to query configuration parameters, and have pg_ctl use that to query the data directory for config-only installs. This fixes awkward or impossible pg_ctl operation for config-only installs.	2011-10-06 09:38:39 -04:00
Heikki Linnakangas	7f3bd86843	Replace the "New Linear" GiST split algorithm for boxes and points with a new double-sorting algorithm. The new algorithm produces better quality trees, making searches faster. Alexander Korotkov	2011-10-06 10:03:46 +03:00
Tom Lane	ba6f629326	Improve and simplify CREATE EXTENSION's management of GUC variables. CREATE EXTENSION needs to transiently set search_path, as well as client_min_messages and log_min_messages. We were doing this by the expedient of saving the current string value of each variable, doing a SET LOCAL, and then doing another SET LOCAL with the previous value at the end of the command. This is a bit expensive though, and it also fails badly if there is anything funny about the existing search_path value, as seen in a recent report from Roger Niederland. Fortunately, there's a much better way, which is to piggyback on the GUC infrastructure previously developed for functions with SET options. We just open a new GUC nesting level, do our assignments with GUC_ACTION_SAVE, and then close the nesting level when done. This automatically restores the prior settings without a re-parsing pass, so (in principle anyway) there can't be an error. And guc.c still takes care of cleanup in event of an error abort. The CREATE EXTENSION code for this was modeled on some much older code in ri_triggers.c, which I also changed to use the better method, even though there wasn't really much risk of failure there. Also improve the comments in guc.c to reflect this additional usage.	2011-10-05 20:44:16 -04:00
Tom Lane	41e461d36f	Improve define_custom_variable's handling of pre-existing settings. Arrange for any problems with pre-existing settings to be reported as WARNING not ERROR, so that we don't undesirably abort the loading of the incoming add-on module. The bad setting is just discarded, as though it had never been applied at all. (This requires a change in the API of set_config_option. After some thought I decided the most potentially useful addition was to allow callers to just pass in a desired elevel.) Arrange to restore the complete stacked state of the variable, rather than cheesily reinstalling only the active value. This ensures that custom GUCs will behave unsurprisingly even when the module loading operation occurs within nested subtransactions that have changed the active value. Since a module load could occur as a result of, eg, a PL function call, this is not an unlikely scenario.	2011-10-04 19:57:21 -04:00
Tom Lane	fa56a0c3e0	Fix uninitialized-variable bug.	2011-10-04 17:08:18 -04:00
Tom Lane	4bcb82a7d5	Add sourcefile/sourceline data to EXEC_BACKEND GUC transmission files. This oversight meant that on Windows, the pg_settings view would not display source file or line number information for values coming from postgresql.conf, unless the backend had received a SIGHUP since starting. In passing, also make the error detection in read_nondefault_variables a tad more thorough, and fix it to not lose precision on float GUCs (these changes are already in HEAD as of my previous commit).	2011-10-04 16:47:48 -04:00
Tom Lane	9f5836d224	Remember the source GucContext for each GUC parameter. We used to just remember the GucSource, but saving GucContext too provides a little more information --- notably, whether a SET was done by a superuser or regular user. This allows us to rip out the fairly dodgy code that define_custom_variable used to use to try to infer the context to re-install a pre-existing setting with. In particular, it now works for a superuser to SET an extension's SUSET custom variable before loading the associated extension, because GUC can remember whether the SET was done as a superuser or not. The plperl regression tests contain an example where this is useful.	2011-10-04 16:13:50 -04:00
Alvaro Herrera	09e196e453	Use callbacks in SlruScanDirectory for the actual action. Previously, the code assumed that the only possible action to take was to delete files behind a certain cutoff point. The async notify code was already a crock: it used a different "pagePrecedes" function for truncation than for regular operation. By allowing it to pass a callback to SlruScanDirectory it can do cleanly exactly what it needs to do. The clog.c code also had its own use for SlruScanDirectory, which is made a bit simpler with this.	2011-10-04 14:03:23 -03:00
Tom Lane	1a00c0ef53	Remove the custom_variable_classes parameter. This variable provides only marginal error-prevention capability (since it can only check the prefix of a qualified GUC name), and the consensus is that that isn't worth the amount of hassle that maintaining the setting creates for DBAs. So, let's just remove it. With this commit, the system will silently accept a value for any qualified GUC name at all, whether it has anything to do with any known extension or not. (Unqualified names still have to match known built-in settings, though; and you will get a WARNING at extension load time if there's an unrecognized setting with that extension's prefix.) There's still some discussion ongoing about whether to tighten that up and if so how; but if we do come up with a solution, it's not likely to look anything like custom_variable_classes.	2011-10-04 12:36:55 -04:00
Tom Lane	76074fcaa0	ProcedureCreate neglected to record dependencies on default expressions. Thus, an object referenced in a default expression could be dropped while the function remained present. This was unaccountably missed in the original patch to add default parameters for functions. Reported by Pavel Stehule.	2011-10-03 12:13:15 -04:00
Tom Lane	d56b3afc03	Restructure error handling in reading of postgresql.conf. This patch has two distinct purposes: to report multiple problems in postgresql.conf rather than always bailing out after the first one, and to change the policy for whether changes are applied when there are unrelated errors in postgresql.conf. Formerly the policy was to apply no changes if any errors could be detected, but that had a significant consistency problem, because in some cases specific values might be seen as valid by some processes but invalid by others. This meant that the latter processes would fail to adopt changes in other parameters even though the former processes had done so. The new policy is that during SIGHUP, the file is rejected as a whole if there are any errors in the "name = value" syntax, or if any lines attempt to set nonexistent built-in parameters, or if any lines attempt to set custom parameters whose prefix is not listed in (the new value of) custom_variable_classes. These tests should always give the same results in all processes, and provide what seems a reasonably robust defense against loading values from badly corrupted config files. If these tests pass, all processes will apply all settings that they individually see as good, ignoring (but logging) any they don't. In addition, the postmaster does not abandon reading a configuration file after the first syntax error, but continues to read the file and report syntax errors (up to a maximum of 100 syntax errors per file). The postmaster will still refuse to start up if the configuration file contains any errors at startup time, but these changes allow multiple errors to be detected and reported before quitting. Alexey Klyukin, reviewed by Andy Colson and av (Alexander ?) with some additional hacking by Tom Lane	2011-10-02 16:50:04 -04:00
Tom Lane	5ec6b7f1b8	Improve generated column names for cases involving sub-SELECTs. We'll now use "exists" for EXISTS(SELECT ...), "array" for ARRAY(SELECT ...), or the sub-select's own result column name for a simple expression sub-select. Previously, you usually got "?column?" in such cases. Marti Raudsepp, reviewed by Kyotaro Horiugchi	2011-10-01 14:01:46 -04:00
Tom Lane	d22a09dc70	Support GiST index support functions that want to cache data across calls. pg_trgm was already doing this unofficially, but the implementation hadn't been thought through very well and leaked memory. Restructure the core GiST code so that it actually works, and document it. Ordinarily this would have required an extra memory context creation/destruction for each GiST index search, but I was able to avoid that in the normal case of a non-rescanned search by finessing the handling of the RBTree. It used to have its own context always, but now shares a context with the scan-lifespan data structures, unless there is more than one rescan call. This should make the added overhead unnoticeable in typical cases.	2011-09-30 19:48:57 -04:00
Tom Lane	79edb2b1dc	Fix recursion into previously planned sub-query in examine_simple_variable. This code was looking at the sub-Query tree as seen in the parent query's RangeTblEntry; but that's the pristine parser output, and what we need to look at is the tree as it stands at the completion of planning. Otherwise we might pick up a Var that references a subquery that got flattened and hence has no RelOptInfo in the subroot. Per report from Peter Geoghegan.	2011-09-29 18:13:16 -04:00
Bruce Momjian	054219c907	Fix pg_upgrade for EXEC_BACKEND builds (e.g. Windows) by properly passing the -b/binary-upgrade flag. Backpatch to 9.1.X.	2011-09-29 17:21:34 -04:00
Tom Lane	cb37c29106	Fix index matching for operators with mixed collatable/noncollatable inputs. If an indexable operator for a non-collatable indexed datatype has a collatable right-hand input type, any OpExpr for it will be marked with a nonzero inputcollid (since having one collatable input is sufficient to make that happen). However, an index on a non-collatable column certainly doesn't have any collation. This caused us to fail to match such operators to their indexes, because indxpath.c required an exact match of index collation and clause collation. It seems correct to allow a match when the index is collation-less regardless of the clause's inputcollid: an operator with both noncollatable and collatable inputs could perhaps depend on the collation of the collatable input, but it could hardly expect the index for the noncollatable input to have that same collation. Per bug #6232 from Pierre Ducroquet. His example is specifically about "hstore ? text" but the problem seems quite generic.	2011-09-29 00:43:42 -04:00
Robert Haas	f70648d5a1	Update comments related to the crash-safety of the visibility map. In hio.c, document how we avoid deadlock with respect to visibility map buffer locks. In visibilitymap.c, update the LOCKING section of the file header comment. Both oversights noted by Heikki Linnakangas.	2011-09-27 09:30:23 -04:00
Robert Haas	624f155ffa	heap_update() must recheck tuple after unlocking and relocking buffer. Bug found by Alvaro Herrera, fix suggested by Heikki Linnakangas and reviewed by Tom Lane.	2011-09-27 08:24:18 -04:00
Tom Lane	269c5dd2f4	Fix window functions that sort by expressions involving aggregates. In commit `c1d9579dd8`, I changed things so that the output of the Agg node that feeds the window functions would not list any ungrouped Vars directly. Formerly, for example, the Agg tlist might have included both "x" and "sum(x)", which is not really valid if "x" isn't a grouping column. If we then had a window function ordering on something like "sum(x) + 1", prepare_sort_from_pathkeys would find no exact match for this in the Agg tlist, and would conclude that it must recompute the expression. But it would break the expression down to just the Var "x", which it would find in the tlist, and then rebuild the ORDER BY expression using a reference to the subplan's "x" output. Now, after the above-referenced changes, "x" isn't in the Agg tlist if it's not a grouping column, so that prepare_sort_from_pathkeys fails with "could not find pathkey item to sort", as reported by Bricklen Anderson. The fix is to not break down Aggrefs into their component parts, but just treat them as irreducible expressions to be sought in the subplan tlist. This is definitely OK for the use with respect to window functions in grouping_planner, since it just built the tlist being used on the same basis. AFAICT it is safe for other uses too; most of the other call sites couldn't encounter Aggrefs anyway.	2011-09-26 23:48:39 -04:00
Tom Lane	57eb009092	Allow snapshot references to still work during transaction abort. In REPEATABLE READ (nee SERIALIZABLE) mode, an attempt to do GetTransactionSnapshot() between AbortTransaction and CleanupTransaction failed, because GetTransactionSnapshot would recompute the transaction snapshot (which is already wrong, given the isolation mode) and then re-register it in the TopTransactionResourceOwner, leading to an Assert because the TopTransactionResourceOwner should be empty of resources after AbortTransaction. This is the root cause of bug #6218 from Yamamoto Takashi. While changing plancache.c to avoid requesting a snapshot when handling a ROLLBACK masks the problem, I think this is really a snapmgr.c bug: it's lower-level than the resource manager mechanism and should not be shutting itself down before we unwind resource manager resources. However, just postponing the release of the transaction snapshot until cleanup time didn't work because of the circular dependency with TopTransactionResourceOwner. Fix by managing the internal reference to that snapshot manually instead of depending on TopTransactionResourceOwner. This saves a few cycles as well as making the module layering more straightforward. predicate.c's dependencies on TopTransactionResourceOwner go away too. I think this is a longstanding bug, but there's no evidence that it's more than a latent bug, so it doesn't seem worth any risk of back-patching.	2011-09-26 22:25:28 -04:00
Robert Haas	821fd903f9	Update obsolete comments. This was partially fixed by `57fdb2b0d8`, back in 2005, but it missed a couple of spots. YAMAMOTO Takashi	2011-09-26 13:12:22 -04:00
Tom Lane	21fb95da46	Use a fresh copy of query_list when making a second plan in GetCachedPlan. The code path that tried a generic plan, didn't like it, and then made a custom plan was mistakenly passing the same copy of the query_list to the planner both times. This doesn't work too well for nontrivial queries, since the planner tends to scribble on its input. Diagnosis and fix by Yamamoto Takashi.	2011-09-26 12:44:17 -04:00
Tom Lane	d5aa7a9fe6	Avoid unnecessary snapshot-acquisitions in BuildCachedPlan. I had copied-and-pasted a claim that we couldn't reach this point when dealing with utility statements, but that was a leftover from when the caller was required to supply a plan to start with. We now will go through here at least once when handling a utility statement, so it seems worth a check to see whether a snapshot is actually needed. (Note that analyze_requires_snapshot is quite a cheap test.) Per suggestion from Yamamoto Takashi. I don't think I believe that this resolves his reported assertion failure; but it's worth changing anyway, just to save a cycle or two.	2011-09-25 17:34:20 -04:00
Tom Lane	7741dd6590	Recognize self-contradictory restriction clauses for non-table relations. The constraint exclusion feature checks for contradictions among scan restriction clauses, as well as contradictions between those clauses and a table's CHECK constraints. The first aspect of this testing can be useful for non-table relations (such as subqueries or functions-in-FROM), but the feature was coded with only the CHECK case in mind so we were applying it only to plain-table RTEs. Move the relation_excluded_by_constraints call so that it is applied to all RTEs not just plain tables. With the default setting of constraint_exclusion this results in no extra work, but with constraint_exclusion = ON we will detect optimizations that we missed before (at the cost of more planner cycles than we expended before). Per a gripe from Gunnlaugur Þór Briem. Experimentation with his example also showed we were not being very bright about the case where constraint exclusion is proven within a subquery within UNION ALL, so tweak the code to allow set_append_rel_pathlist to recognize such cases.	2011-09-24 19:33:16 -04:00
Robert Haas	0c8eda6258	Memory barrier support for PostgreSQL. This is not actually used anywhere yet, but it gets the basic infrastructure in place. It is fairly likely that there are bugs, and support for some important platforms may be missing, so we'll need to refine this as we go along.	2011-09-23 17:52:43 -04:00
Tom Lane	f197272365	Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps. This provides information about the numbers of tuples that were visited but not returned by table scans, as well as the numbers of join tuples that were considered and discarded within a join plan node. There is still some discussion going on about the best way to report counts for outer-join situations, but I think most of what's in the patch would not change if we revise that, so I'm going to go ahead and commit it as-is. Documentation changes to follow (they weren't in the submitted patch either). Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom	2011-09-22 11:30:11 -04:00
Robert Haas	4893552e21	Fix another bit of unlogged-table-induced breakage. Per bug #6205, reported by Abel Abraham Camarillo Ojeda. This isn't a particularly elegant fix, but I'm trying to minimize the chances of causing yet another round of breakage. Adjust regression tests to exercise this case.	2011-09-21 10:48:31 -04:00
Tom Lane	2562dcea81	Suppress "unused function" warning when not HAVE_LOCALE_T. Forgot to consider this case ...	2011-09-20 17:47:21 -04:00
Tom Lane	37d4fd2b9d	Improve reporting of newlocale() failures in CREATE COLLATION. The standardized errno code for "no such locale" failures is ENOENT, which we were just reporting at face value, viz "No such file or directory". Per gripe from Thom Brown, this might confuse users, so add an errdetail message to clarify what it means. Also, report newlocale() failures as ERRCODE_INVALID_PARAMETER_VALUE rather than using errcode_for_file_access(), since newlocale()'s errno values aren't necessarily tied directly to file access failures.	2011-09-20 13:23:40 -04:00
Tom Lane	c4ae968633	Fix Assert failure in new plancache code. The regression tests were failing with CLOBBER_CACHE_ALWAYS enabled, as reported by buildfarm member jaguar. There was an Assert in BuildCachedPlan that asserted that the CachedPlanSource hadn't been invalidated since we called RevalidateCachedQuery, which in theory can't happen because we are holding locks on all the relevant database objects. However, CLOBBER_CACHE_ALWAYS generates a false positive by making an invalidation happen anyway; and on reflection, that could also occur as a result of a badly-timed sinval reset due to queue overflow. We could just remove the Assert and forge ahead with the not-really-stale querytree, but it seems safer to do another RevalidateCachedQuery call just to make real sure everything's OK.	2011-09-17 01:47:33 -04:00
Tom Lane	99b5454167	Remove debug logging for pgstat wait timeout. This reverts commit `79b2ee20c8`, which proved to not be very informative; it looks like the "pgstat wait timeout" warnings in the buildfarm are just a symptom of running on heavily loaded machines, and there isn't any weird mechanism causing them to appear. To try to reduce the frequency of buildfarm failures from this effect, increase PGSTAT_MAX_WAIT_TIME from 5 seconds to 10. Also, arrange to not send a fresh inquiry message every single time through the loop, as that seems more likely to cause problems (by swamping the collector) than fix them. We'll now send an inquiry the first time through the delay loop, and every 640 msec thereafter.	2011-09-16 18:25:27 -04:00
Tom Lane	9d306c66e6	Avoid unnecessary page-level SSI lock check in heap_insert(). As observed by Heikki, we need not conflict on heap page locks during an insert; heap page locks are only aggregated tuple locks, they don't imply locking "gaps" as index page locks do. So we can avoid some unnecessary conflicts, and also do the SSI check while not holding exclusive lock on the target buffer. Kevin Grittner, reviewed by Jeff Davis. Back-patch to 9.1.	2011-09-16 14:47:20 -04:00
Tom Lane	0a6cc28500	gistendscan() forgot to free so->giststate. This oversight led to a massive memory leak --- upwards of 10KB per tuple --- during creation-time verification of an exclusion constraint based on a GIST index. In most other scenarios it'd just be a leak of 10KB that would be recovered at end of query, so not too significant; though perhaps the leak would be noticeable in a situation where a GIST index was being used in a nestloop inner indexscan. In any case, it's a real leak of long standing, so patch all supported branches. Per report from Harald Fuchs.	2011-09-16 04:27:49 -04:00
Tom Lane	e6faf910d7	Redesign the plancache mechanism for more flexibility and efficiency. Rewrite plancache.c so that a "cached plan" (which is rather a misnomer at this point) can support generation of custom, parameter-value-dependent plans, and can make an intelligent choice between using custom plans and the traditional generic-plan approach. The specific choice algorithm implemented here can probably be improved in future, but this commit is all about getting the mechanism in place, not the policy. In addition, restructure the API to greatly reduce the amount of extraneous data copying needed. The main compromise needed to make that possible was to split the initial creation of a CachedPlanSource into two steps. It's worth noting in particular that SPI_saveplan is now deprecated in favor of SPI_keepplan, which accomplishes the same end result with zero data copying, and no need to then spend even more cycles throwing away the original SPIPlan. The risk of long-term memory leaks while manipulating SPIPlans has also been greatly reduced. Most of this improvement is based on use of the recently-added MemoryContextSetParent primitive.	2011-09-16 00:43:52 -04:00
Alvaro Herrera	86822df9b5	Split walsender.h in public/private headers. This dramatically cuts short the number of headers the public one brings into whatever includes it.	2011-09-13 21:42:49 -03:00
Tom Lane	6693c9a5ed	deflist_to_tuplestore dumped core on an option with no value. Make it return NULL for the option_value, instead. Per report from Frank van Vugt. Back-patch to 8.4 where this code was added.	2011-09-13 11:36:49 -04:00
Heikki Linnakangas	8caf6132c7	In the final emptying phase of the new GiST buffering build, set the queuedForEmptying flag correctly on buffer when adding it to the queue. Also, don't add buffer to the queue if it's there already. These were harmless oversights; failing to set the flag just means that a buffer might get added to the queue twice if more tuples are added to it (although that can't actually happen at this point because all the upper buffers have already been emptied), and having the same buffer twice in the emptying queue is harmless. But better be tidy.	2011-09-12 13:06:06 +03:00
Tom Lane	b0025bd957	Invent a new memory context primitive, MemoryContextSetParent. This function will be useful for altering the lifespan of a context after creation (for example, by creating it under a transient context and later reparenting it to belong to a long-lived context). It costs almost no new code, since we can refactor what was there. Per my proposal of yesterday.	2011-09-11 16:29:42 -04:00
Peter Eisentraut	1b81c2fe6e	Remove many -Wcast-qual warnings. This addresses only those cases that are easy to fix by adding or moving a const qualifier or removing an unnecessary cast. There are many more complicated cases remaining.	2011-09-11 21:54:32 +03:00
Tom Lane	ca4af308c3	Simplify handling of the timezone GUC by making initdb choose the default. We were doing some amazingly complicated things in order to avoid running the very expensive identify_system_timezone() procedure during GUC initialization. But there is an obvious fix for that, which is to do it once during initdb and have initdb install the system-specific default into postgresql.conf, as it already does for most other GUC variables that need system-environment-dependent defaults. This means that the timezone (and log_timezone) settings no longer have any magic behavior in the server. Per discussion.	2011-09-09 17:59:11 -04:00
Tom Lane	a7801b62f2	Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h. As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.	2011-09-09 13:23:41 -04:00
Tom Lane	d63de337f3	round() is not portable. Use rint().	2011-09-08 16:38:24 -04:00
Alvaro Herrera	295e7dc929	Tweak string for uniformity	2011-09-08 16:39:58 -03:00
Heikki Linnakangas	5edb24a898	Buffering GiST index build algorithm. When building a GiST index that doesn't fit in cache, buffers are attached to some internal nodes in the index. This speeds up the build by avoiding random I/O that would otherwise be needed to traverse all the way down the tree to find the right leaf page for a tuple. Alexander Korotkov	2011-09-08 17:51:23 +03:00
Tom Lane	f0bedf3e45	Fix corner case bug in numeric to_char(). Trailing-zero stripping applied by the FM specifier could strip zeroes to the left of the decimal point, for a format with no digit positions after the decimal point (such as "FM999."). Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.	2011-09-07 17:07:20 -04:00
Tom Lane	99155aaa33	Fix typo in error message. Per Euler Taveira de Oliveira.	2011-09-07 13:29:26 -04:00
Tom Lane	a7d9203cc4	Fix get_name_for_var_field() to deal with RECORD Params. With 9.1's use of Params to pass down values from NestLoop join nodes to their inner plans, it is possible for a Param to have type RECORD, in which case the set of fields comprising the value isn't determinable by inspection of the Param alone. However, just as with a Var of type RECORD, we can find out what we need to know if we can locate the expression that the Param represents. We already knew how to do this in get_parameter(), but I'd overlooked the need to be able to cope in get_name_for_var_field(), which led to EXPLAIN failing with "record type has not been registered". To fix, refactor the search code in get_parameter() so it can be used by both functions. Per report from Marti Raudsepp.	2011-09-07 13:01:36 -04:00
Bruce Momjian	029dfdf115	Fix to_date() and to_timestamp() to handle year masks of length < 4 so they wrap toward year 2020, rather than the inconsistent behavior we had before.	2011-09-07 09:47:51 -04:00
Simon Riggs	df383b03e6	Partially revoke attempt to improve performance with many savepoints. Maintain difference between subtransaction release and commit introduced by earlier patch.	2011-09-07 12:11:26 +01:00
Simon Riggs	dde70cc313	Emit cascaded standby message on shutdown only when appropriate. Adds additional test for active walsenders and closes a race condition for when we failover when a new walsender was connecting. Reported and fixed by Fujii Masao. Review by Heikki Linnakangas	2011-09-07 09:09:47 +01:00
Tom Lane	db10f01baa	Improve comment about handling of temp tables in shared-inval code.	2011-09-06 17:06:54 -04:00
Peter Eisentraut	e6d800981e	Correct ancient logic mistake in assertion. Found by gcc -Wlogical-op	2011-09-06 23:05:02 +03:00
Tom Lane	623f77e9d1	Avoid possibly accessing off the end of memory in SJIS2004 conversion. The code in shift_jis_20042euc_jis_2004() would fetch two bytes even when only one remained in the string. Since conversion functions aren't supposed to assume null-terminated input, this poses a small risk of fetching past the end of memory and incurring SIGSEGV. No such crash has been identified in the field, but we've certainly seen the equivalent happen in other code paths, so patch this one all the way back. Report and patch by Noah Misch.	2011-09-06 14:50:28 -04:00
Tom Lane	780a342c90	Avoid possibly accessing off the end of memory in examine_attribute(). Since the last couple of columns of pg_type are often NULL, sizeof(FormData_pg_type) can be an overestimate of the actual size of the tuple data part. Therefore memcpy'ing that much out of the catalog cache, as analyze.c was doing, poses a small risk of copying past the end of memory and incurring SIGSEGV. No such crash has been identified in the field, but we've certainly seen the equivalent happen in other code paths, so patch this one all the way back. Per valgrind testing by Noah Misch, though this is not his proposed patch. I chose to use SearchSysCacheCopy1 rather than inventing special-purpose infrastructure for copying only the minimal part of a pg_type tuple.	2011-09-06 14:37:22 -04:00
Bruce Momjian	f458c90bff	Add C comment about why we send cache invalidation messages for session-local objects.	2011-09-05 22:09:02 -04:00
Alvaro Herrera	56a9ed92b6	Adjust translator comment format to xgettext expectations	2011-09-05 19:04:30 -03:00
Alvaro Herrera	b64f18c583	Mark some untranslatable messages with errmsg_internal	2011-09-05 17:48:07 -03:00
Peter Eisentraut	a2a5ce6826	Improve "invalid byte sequence for encoding" message. It used to say: ERROR: invalid byte sequence for encoding "UTF8": 0xdb24. Change this to: ERROR: invalid byte sequence for encoding "UTF8": 0xdb 0x24, to make it clear that this is a byte sequence and not a code point. Also fix the adjacent "character has no equivalent" message that has the same issue.	2011-09-05 23:38:27 +03:00
Tom Lane	4c2777d0b7	Change get_variable_numdistinct's API to flag default estimates explicitly. Formerly, callers tested for DEFAULT_NUM_DISTINCT, which had the problem that a perfectly solid estimate might be mistaken for a content-free default.	2011-09-04 15:41:49 -04:00
Tom Lane	1cb108efb0	Dig down into sub-selects to look for column statistics. If a sub-select's output column is a simple Var, recursively look for statistics applying to that Var, and use them if available. The need for this was foreseen ages ago, but we didn't have enough infrastructure to do it with reasonable speed until just now. We punt and stick with default estimates if the subquery uses set operations, GROUP BY, or DISTINCT, since those operations would change the underlying column statistics (particularly, the relative frequencies of different values) beyond recognition. This means that the types of sub-selects for which this improvement applies are fairly limited, since most subqueries satisfying those restrictions would have gotten flattened into the parent query anyway. But it does help for some cases, such as subqueries with ORDER BY or LIMIT.	2011-09-04 15:13:46 -04:00
Tom Lane	698df3350d	Can't print PlannerGlobal's subroots list in outfuncs. Since the subroots will surely link back to the same glob struct, this necessarily leads to infinite recursion. Doh. Found while trying to debug some other code.	2011-09-04 14:43:52 -04:00
Tom Lane	1609797c25	Clean up the #include mess a little. walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.	2011-09-04 01:13:16 -04:00
Tom Lane	b3aaf9081a	Rearrange planner to save the whole PlannerInfo (subroot) for a subquery. Formerly, set_subquery_pathlist and other creators of plans for subqueries saved only the rangetable and rowMarks lists from the lower-level PlannerInfo. But there's no reason not to remember the whole PlannerInfo, and indeed this turns out to simplify matters in a number of places. The immediate reason for doing this was so that the subroot will still be accessible when we're trying to extract column statistics out of an already-planned subquery. But now that I've done it, it seems like a good code-beautification effort in its own right. I also chose to get rid of the transient subrtable and subrowmark fields in SubqueryScan nodes, in favor of having setrefs.c look up the subquery's RelOptInfo. That required changing all the APIs in setrefs.c to pass PlannerInfo not PlannerGlobal, which was a large but quite mechanical transformation. One side-effect not foreseen at the beginning is that this finally broke inheritance_planner's assumption that replanning the same subquery RTE N times would necessarily give interchangeable results each time. That assumption was always pretty risky, but now we really have to make a separate RTE for each instance so that there's a place to carry the separate subroots.	2011-09-03 15:36:24 -04:00
Peter Eisentraut	42ad992fdc	Add archive_command example	2011-09-03 01:29:09 +03:00
Peter Eisentraut	f1e4f3d44f	Whitespace adjustment for consistency in the file	2011-09-03 01:28:05 +03:00
Tom Lane	5b562644fe	Teach ANALYZE to clear pg_class.relhassubclass when appropriate. In the past, relhassubclass always remained true if a relation had ever had child relations, even if the last subclass was long gone. While this had only marginal performance implications in most cases, it was annoying, and I'm now considering some planner changes that would raise the cost of a false positive. It was previously impractical to fix this because of race condition concerns. However, given the recent change that made tablecmds.c take ShareExclusiveLock on relations that are gaining a child (commit `fbcf4b92aa`), we can now allow ANALYZE to clear the flag when it's no longer relevant. There is no additional locking cost to do so, since ANALYZE takes ShareExclusiveLock anyway.	2011-09-02 14:29:31 -04:00
Bruce Momjian	10af3ab2b2	Add C comment about needed include.	2011-09-01 12:53:45 -04:00
Tom Lane	e5b012b788	Put back improperly removed #include.	2011-09-01 11:57:46 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Heikki Linnakangas	a88b6e4cfb	setlocale() on Windows doesn't work correctly if the locale name contains dots. I previously worked around this in initdb, mapping the known problematic locale names to aliases that work, but Hiroshi Inoue pointed out that that's not enough because even if you use one of the aliases, like "Chinese_HKG", setlocale(LC_CTYPE, NULL) returns back the long form, ie. "Chinese_Hong Kong S.A.R.". When we try to restore an old locale value by passing that value back to setlocale(), it fails. Note that you are affected by this bug also if you use one of those short-form names manually, so just reverting the hack in initdb won't fix it. To work around that, move the locale name mapping from initdb to a wrapper around setlocale(), so that the mapping is invoked on every setlocale() call. Also, add a few checks for failed setlocale() calls in the backend. These calls shouldn't fail, and if they do there isn't much we can do about it, but at least you'll get a warning. Backpatch to 9.1, where the initdb hack was introduced. The Windows bug affects older versions too if you set locale manually to one of the aliases, but given the lack of complaints from the field, I'm hesitant to backpatch.	2011-09-01 11:08:32 +03:00
Tom Lane	0d3b231eeb	Further repair of eqjoinsel ndistinct-clamping logic. Examination of examples provided by Mark Kirkwood and others has convinced me that actually commit `7f3eba30c9` was quite a few bricks shy of a load. The useful part of that patch was clamping ndistinct for the inner side of a semi or anti join, and the reason why that's needed is that it's the only way that restriction clauses eliminating rows from the inner relation can affect the estimated size of the join result. I had not clearly understood why the clamping was appropriate, and so mis-extrapolated to conclude that we should clamp ndistinct for the outer side too, as well as for both sides of regular joins. These latter actions were all wrong, and are reverted with this patch. In addition, the clamping logic is now made to affect the behavior of both paths in eqjoinsel_semi, with or without MCV lists to compare. When we have MCVs, we suppose that the most common values are the ones that are most likely to survive the decimation resulting from a lower restriction clause, so we think of the clamping as eliminating non-MCV values, or potentially even the least-common MCVs for the inner relation. Back-patch to 8.4, same as previous fixes in this area.	2011-09-01 00:19:38 -04:00
Tom Lane	97930cf578	Improve eqjoinsel's ndistinct clamping to work for multiple levels of join. This patch fixes an oversight in my commit `7f3eba30c9` of 2008-10-23. That patch accounted for baserel restriction clauses that reduced the number of rows coming out of a table (and hence the number of possibly-distinct values of a join variable), but not for join restriction clauses that might have been applied at a lower level of join. To account for the latter, look up the sizes of the min_lefthand and min_righthand inputs of the current join, and clamp with those in the same way as for the base relations. Noted while investigating a complaint from Ben Chobot, although this in itself doesn't seem to explain his report. Back-patch to 8.4; previous versions used different estimation methods for which this heuristic isn't relevant.	2011-08-31 16:05:43 -04:00
Tom Lane	5bba65de94	Fix a missed case in code for "moving average" estimate of reltuples. It is possible for VACUUM to scan no pages at all, if the visibility map shows that all pages are all-visible. In this situation VACUUM has no new information to report about the relation's tuple density, so it wasn't changing pg_class.reltuples ... but it updated pg_class.relpages anyway. That's wrong in general, since there is no evidence to justify changing the density ratio reltuples/relpages, but it's particularly bad if the previous state was relpages=reltuples=0, which means "unknown tuple density". We just replaced "unknown" with "zero". ANALYZE would eventually recover from this, but it could take a lot of repetitions of ANALYZE to do so if the relation size is much larger than the maximum number of pages ANALYZE will scan, because of the moving-average behavior introduced by commit `b4b6923e03`. The only known situation where we could have relpages=reltuples=0 and yet the visibility map asserts everything's visible is immediately following a pg_upgrade. It might be advisable for pg_upgrade to try to preserve the relpages/reltuples statistics; but in any case this code is wrong on its own terms, so fix it. Per report from Sergey Koposov. Back-patch to 8.4, where the visibility map was introduced, same as the previous change.	2011-08-30 14:51:38 -04:00
Robert Haas	8a3d33c8e6	Fix parsing of time string followed by yesterday/today/tomorrow. Previously, 'yesterday 04:00:00'::timestamp didn't do the same thing as '04:00:00 yesterday'::timestamp, and the return value from the latter was midnight rather than the specified time. Dean Rasheed, with some stylistic changes	2011-08-30 11:38:42 -04:00
Robert Haas	eab2ef6164	Remove some tabs from README file. Some of the ASCII art expected 8-space tab stops, and some of it expected 4-space tab stops. Per report from YAMAMOTO Takashi.	2011-08-29 22:26:29 -04:00
Tom Lane	a5b7640ba0	Fix concat_ws() to not insert a separator after leading NULL argument(s). Per bug #6181 from Itagaki Takahiro. Also do some marginal code cleanup and improve error handling.	2011-08-29 15:20:57 -04:00
Robert Haas	c01c25fbe5	Improve spinlock performance for HP-UX, ia64, non-gcc. At least on this architecture, it's very important to spin on a non-atomic instruction and only retry the atomic once it appears that it will succeed. To fix this, split TAS() into two macros: TAS(), for trying to grab the lock the first time, and TAS_SPIN(), for spinning until we get it. TAS_SPIN() defaults to same as TAS(), but we can override it when we know there's a better way. It's likely that some of the other cases in s_lock.h require similar treatment, but this is the only one we've got conclusive evidence for at present.	2011-08-29 10:05:48 -04:00
Bruce Momjian	4bd7333b14	Allow more include files to be compiled on their own by adding missing include dependencies. Modify pgcompinclude to skip a common fcinfo error.	2011-08-27 11:05:33 -04:00
Peter Eisentraut	fd5b397ca4	Implement the information schema with_hierarchy column. In PostgreSQL, this is included in the SELECT privilege, so show YES or NO depending on whether SELECT is granted.	2011-08-27 15:03:02 +03:00
Bruce Momjian	f261deb4b4	Add missing includes after pgrminclude run.	2011-08-26 18:15:14 -04:00
Bruce Momjian	f8fc37b337	Add markers for skips.	2011-08-26 18:15:13 -04:00
Tom Lane	00eb036c11	Fix potential memory clobber in tsvector_concat(). tsvector_concat() allocated its result workspace using the "conservative" estimate of the sum of the two input tsvectors' sizes. Unfortunately that wasn't so conservative as all that, because it supposed that the number of pad bytes required could not grow. Which it can, as per test case from Jesper Krogh, if there's a mix of lexemes with positions and lexemes without them in the input data. The fix is to assume that we might add a not-previously-present pad byte for each and every lexeme in the two inputs; which really is conservative, but it doesn't seem worthwhile to try to be more precise. This is an aboriginal bug in tsvector_concat, so back-patch to all versions containing it.	2011-08-26 16:51:34 -04:00
Tom Lane	ecf248737a	Add makefile rules to check for backtracking in backend and psql lexers. Per discussion, we should enforce the policy of "no backtracking" in these performance-sensitive scanners.	2011-08-25 14:44:17 -04:00
Tom Lane	2e95f1f002	Add "%option warn" to all flex input files that lacked it. This is recommended in the flex manual, and there seems no good reason not to use it everywhere.	2011-08-25 13:55:57 -04:00
Robert Haas	48bc57657d	Tweak postgresql.conf.sample's comments on listen_addresses. This makes it slightly more clear that '*' is not part of the default value, in case that wasn't obvious. As requested by Dougal Sutherland.	2011-08-25 09:41:24 -04:00
Tom Lane	cb5c2ba2d8	Fix multiple bugs in extension dropping. When we implemented extensions, we made findDependentObjects() treat EXTENSION dependency links similarly to INTERNAL links. However, that logic contained an implicit assumption that an object could have at most one INTERNAL dependency, so it did not work correctly for objects having both INTERNAL and DEPENDENCY links. This led to failure to drop some extension member objects when dropping the extension. Furthermore, we'd never actually exercised the case of recursing to an internally-referenced (owning) object from anything other than a NORMAL dependency, and it turns out that passing the incoming dependency's flags to the owning object is the Wrong Thing. This led to sometimes dropping a whole extension silently when we should have rejected the drop command for lack of CASCADE. Since we obviously were under-testing extension drop scenarios, add some regression test cases. Unfortunately, such test cases require some extensions (duh), so we can't test for problems in the core regression tests. I chose to add them to the earthdistance contrib module, which is a good test case because it has a dependency on the cube contrib module. Back-patch to 9.1. Arguably these are pre-existing bugs in INTERNAL dependency handling, but since it appears that the cases can never arise pre-9.1, I'll refrain from back-patching the logic changes further than that.	2011-08-24 13:09:06 -04:00
Tom Lane	d4aa491493	Make CREATE EXTENSION check schema creation permissions. When creating a new schema for a non-relocatable extension, we neglected to check whether the calling user has permission to create schemas. That didn't matter in the original coding, since we had already checked superuserness, but in the new dispensation where users need not be superusers, we should check it. Use CreateSchemaCommand() rather than calling NamespaceCreate() directly, so that we also enforce the rules about reserved schema names. Per complaint from KaiGai Kohei, though this isn't the same as his patch.	2011-08-23 21:49:07 -04:00
Tom Lane	43f0c20839	Fix overoptimistic assumptions in column width estimation for subqueries. set_append_rel_pathlist supposed that, while computing per-column width estimates for the appendrel, it could ignore child rels for which the translated reltargetlist entry wasn't a Var. This gave rise to completely silly estimates in some common cases, such as constant outputs from some or all of the arms of a UNION ALL. Instead, fall back on get_typavgwidth to estimate from the value's datatype; which might be a poor estimate but at least it's not completely wacko. That problem was exposed by an Assert in set_subquery_size_estimates, which unfortunately was still overoptimistic even with that fix, since we don't compute attr_widths estimates for appendrels that are entirely excluded by constraints. So remove the Assert; we'll just fall back on get_typavgwidth in such cases. Also, since set_subquery_size_estimates calls set_baserel_size_estimates which calls set_rel_width, there's no need for set_subquery_size_estimates to call get_typavgwidth; set_rel_width will handle it for us if we just leave the estimate set to zero. Remove the unnecessary code. Per report from Erik Rijkers and subsequent investigation.	2011-08-23 17:13:12 -04:00
Peter Eisentraut	1af55e2751	Use consistent format for reporting GetLastError(). Use something like "error code %lu" for reporting GetLastError() values on Windows. Previously, a mix of different wordings and formats were in use.	2011-08-23 22:00:52 +03:00
Robert Haas	7488936478	Typo fix.	2011-08-22 12:16:27 -04:00
Tom Lane	660a081c3f	Fix handling of extension membership when filling in a shell operator. The previous coding would result in deleting and not re-creating the extension membership pg_depend rows, since there was no CommandCounterIncrement that would allow recordDependencyOnCurrentExtension to see that the deletion had happened. Make it work like the shell type case, ie, keep the existing entries (and then throw an error if they're for the wrong extension). Per bug #6172 from Hitoshi Harada. Investigation and fix by Dimitri Fontaine.	2011-08-22 10:55:47 -04:00
Tom Lane	b33f78df17	Fix trigger WHEN conditions when both BEFORE and AFTER triggers exist. Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger fired for the same update. Fix by not trying to overload the use of estate->es_trig_tuple_slot. Per report from Yoran Heling. Back-patch to 9.0, when trigger WHEN conditions were introduced.	2011-08-21 18:15:55 -04:00
Tom Lane	08e1eedf24	Fix performance problem when building a lossy tidbitmap. As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can make building a large tidbitmap into an O(N^2) operation. To fix, make sure we remove more than the minimum amount of information per call, and add a fallback path to behave sanely if we're unable to fit the bitmap within the requested amount of memory. This has been wrong since the tidbitmap code was written, so back-patch to all supported branches.	2011-08-20 14:51:02 -04:00
Robert Haas	0f7acbeddf	Make lazy_vacuum_rel call pg_rusage_init only if needed. do_analyze_rel already does it this way. Euler Taveira de Oliveira	2011-08-18 09:55:04 -04:00
Robert Haas	24bf1552f6	Remove obsolete README file. Perhaps we ought to add some other kind of documentation here instead, but for now let's get rid of this woefully obsolete description of the sinval machinery.	2011-08-18 09:49:41 -04:00
Peter Eisentraut	1bf80041e3	Translation updates	2011-08-17 14:07:46 +03:00
Heikki Linnakangas	1d0392b245	Fix comment about which version had BACKUP METHOD line in backup_label, again. It was invalidated again by Fujii's patch to 9.1.	2011-08-17 12:31:23 +03:00
Tom Lane	b5282aa893	Revise sinval code to remove no-longer-used tuple TID from inval messages. This requires adjusting the API for syscache callback functions: they now get a hash value, not a TID, to identify the target tuple. Most of them weren't paying any attention to that argument anyway, but plancache did require a small amount of fixing. Also, improve performance a trifle by avoiding sending duplicate inval messages when a heap_update isn't changing the catcache lookup columns.	2011-08-16 19:27:46 -04:00
Tom Lane	632ae6829f	Forget about targeting catalog cache invalidations by tuple TID. The TID isn't stable enough: we might queue an sinval event before a VACUUM FULL, and then process it afterwards, when the target tuple no longer has the same TID. So we must invalidate entries on the basis of hash value only. The old coding can be shown to result in various bizarre, hard-to-reproduce errors in the presence of concurrent VACUUM FULLs on system catalogs, and could easily result in permanent catalog corruption, up to and including complete loss of tables. This commit is just a minimal fix that removes the unsafe comparison. We should remove transmission of the tuple TID from sinval messages altogether, and then arrange to suppress the extra message in the common case of a heap_update that doesn't change the key hashvalue. But that's going to be much more invasive, and will only produce a probably-marginal performance gain, so it doesn't seem like material for a back-patch. Back-patch to 9.0. Before that, VACUUM FULL refused to do any tuple moving if it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and CLUSTER would give up altogether), so there was no risk of moving a tuple that might be the subject of an unsent sinval message.	2011-08-16 15:26:22 -04:00
Tom Lane	f4d7f1adba	Fix incorrect order of operations during sinval reset processing. We have to be sure that we have revalidated each nailed-in-cache relcache entry before we try to use it to load data for some other relcache entry. The introduction of "mapped relations" in 9.0 broke this, because although we updated the state kept in relmapper.c early enough, we failed to propagate that information into relcache entries soon enough; in particular, we could try to fetch pg_class rows out of pg_class before we'd updated its relcache entry's rd_node.relNode value from the map. This bug accounts for Dave Gould's report of failures after "vacuum full pg_class", and I believe that there is risk for other system catalogs as well. The core part of the fix is to copy relmapper data into the relcache entries during "phase 1" in RelationCacheInvalidate(), before they'll be used in "phase 2". To try to future-proof the code against other similar bugs, I also rearranged the order in which nailed relations are visited during phase 2: now it's pg_class first, then pg_class_oid_index, then other nailed relations. This should ensure that RelationClearRelation can apply RelationReloadIndexInfo to all nailed indexes without risking use of not-yet-revalidated relcache entries. Back-patch to 9.0 where the relation mapper was introduced.	2011-08-16 14:38:20 -04:00
Tom Lane	7b0d0e9356	Preserve toast value OIDs in toast-swap-by-content for CLUSTER/VACUUM FULL. This works around the problem that a catalog cache entry might contain a toast pointer that we try to dereference just as a VACUUM FULL completes on that catalog. We will see the sinval message on the cache entry when we acquire lock on the toast table, but by that point we've already told tuptoaster.c "here's the pointer to fetch", so it's difficult from a code structural standpoint to update the pointer before we use it. Much less painful to ensure that toast pointers are not invalidated in the first place. We have to add a bit of code to deal with the case that a value that previously wasn't toasted becomes so; but that should be a seldom-exercised corner case, so the inefficiency shouldn't be significant. Back-patch to 9.0. In prior versions, we didn't allow CLUSTER on system catalogs, and VACUUM FULL didn't result in reassignment of toast OIDs, so there was no problem.	2011-08-16 13:48:04 -04:00
Tom Lane	2ada6779c5	Fix race condition in relcache init file invalidation. The previous code tried to synchronize by unlinking the init file twice, but that doesn't actually work: it leaves a window wherein a third process could read the already-stale init file but miss the SI messages that would tell it the data is stale. The result would be bizarre failures in catalog accesses, typically "could not read block 0 in file ..." later during startup. Instead, hold RelCacheInitLock across both the unlink and the sending of the SI messages. This is more straightforward, and might even be a bit faster since only one unlink call is needed. This has been wrong since it was put in (in 2002!), so back-patch to all supported releases.	2011-08-16 13:11:54 -04:00
Heikki Linnakangas	2877c67bc2	Fix bogus comment that claimed that the new BACKUP METHOD line in backup_label was new in 9.0. Spotted by Fujii Masao.	2011-08-16 12:23:51 +03:00
Peter Eisentraut	e5475a80d2	Add "Reason code" prefix to internal SSI error messages. This makes it clearer that the error message is perhaps not supposed to be understood by users, and it also makes it somewhat clearer that it was not accidentally omitted from translation. Idea from Heikki Linnakangas, except that we don't mark "Reason code" for translation at this point, because that would make the implementation too cumbersome.	2011-08-15 15:20:16 +03:00
Tom Lane	52994e9e56	Fix unsafe order of operations in foreign-table DDL commands. When updating or deleting a system catalog tuple, it's necessary to acquire RowExclusiveLock on the catalog before looking up the tuple; otherwise a concurrent VACUUM FULL on the catalog might move the tuple to a different TID before we can apply the update. Coding patterns that find the tuple via a table scan aren't at risk here, but when obtaining the tuple from a catalog cache, correct ordering is important; and several routines in foreigncmds.c got it wrong. Noted while running the regression tests in parallel with VACUUM FULL of assorted system catalogs. For consistency I moved all the heap_open calls to the starts of their functions, including a couple for which there was no actual bug. Back-patch to 8.4 where foreigncmds.c was added.	2011-08-14 15:40:21 -04:00
Tom Lane	592b615d71	Fix incorrect timeout handling during initial authentication transaction. The statement start timestamp was not set before initiating the transaction that is used to look up client authentication information in pg_authid. In consequence, enable_sig_alarm computed a wrong value (far in the past) for statement_fin_time. That didn't have any immediate effect, because the timeout alarm was set without reference to statement_fin_time; but if we subsequently blocked on a lock for a short time, CheckStatementTimeout would consult the bogus value when we cancelled the lock timeout wait, and then conclude we'd timed out, leading to immediate failure of the connection attempt. Thus an innocent "vacuum full pg_authid" would cause failures of concurrent connection attempts. Noted while testing other, more serious consequences of vacuum full on system catalogs. We should set the statement timestamp before StartTransactionCommand(), so that the transaction start timestamp is also valid. I'm not sure if there are any non-cosmetic effects of it not being valid, but the xact timestamp is at least sent to the statistics machinery. Back-patch to 9.0. Before that, the client authentication timeout was done outside any transaction and did not depend on this state to be valid.	2011-08-13 17:52:24 -04:00
Tom Lane	a180776f7a	Teach unix_latch.c to use poll() where available. poll() is preferred over select() on platforms where both are available, because it tends to be a bit faster and it doesn't have an arbitrary limit on the range of FD numbers that can be accessed. The FD range limit does not appear to be a risk factor for any 9.1 usages, so this doesn't need to be back-patched, but we need to have it in place if we keep on expanding the uses of WaitLatch.	2011-08-11 12:50:22 -04:00
Robert Haas	5057366eed	Unbreak legacy syntax "COMMENT ON RULE x IS y", with no relation name. check_object_ownership() isn't happy about the null relation pointer. We could fix it there, but this seems more future-proof.	2011-08-11 11:23:51 -04:00
Tom Lane	cff75130b5	Remove wal_sender_delay GUC, because it's no longer useful. The latch infrastructure is now capable of detecting all cases where the walsender loop needs to wake up, so there is no reason to have an arbitrary timeout. Also, modify the walsender loop logic to follow the standard pattern of ResetLatch, test for work to do, WaitLatch. The previous coding was both hard to follow and buggy: it would sometimes busy-loop despite having nothing available to do, eg between receipt of a signal and the next time it was caught up with new WAL, and it also had interesting choices like deciding to update to WALSNDSTATE_STREAMING on the strength of information known to be obsolete.	2011-08-10 18:50:28 -04:00
Tom Lane	79b2ee20c8	Add a bit of debug logging to backend_read_statsfile(). This is in hopes of learning more about what causes "pgstat wait timeout" warnings in the buildfarm. This patch should probably be reverted once we've learned what we can. As coded, it will result in regression test "failures" at half the delay that the existing code does, so I expect to see a few more than before.	2011-08-10 16:45:43 -04:00
Tom Lane	4dab3d5ae1	Change the autovacuum launcher to use WaitLatch instead of a poll loop. In pursuit of this (and with the expectation that WaitLatch will be needed in more places), convert the latch field that was already added to PGPROC for sync rep into a generic latch that is activated for all PGPROC-owning processes, and change many of the standard backend signal handlers to set that latch when a signal happens. This will allow WaitLatch callers to be wakened properly by these signals. In passing, fix a whole bunch of signal handlers that had been hacked to do things that might change errno, without adding the necessary save/restore logic for errno. Also make some minor fixes in unix_latch.c, and clean up bizarre and unsafe scheme for disowning the process's latch. Much of this has to be back-patched into 9.1. Peter Geoghegan, with additional work by Tom	2011-08-10 12:22:21 -04:00
Heikki Linnakangas	41f9ffd928	If backup-end record is not seen, and we reach end of recovery from a streamed backup, throw an error and refuse to start up. The restore has not finished correctly in that case and the data directory is possibly corrupt. We already errored out in case of archive recovery, but could not during crash recovery because we couldn't distinguish between the case that pg_start_backup() was called and the database then crashed (must not error, data is OK), and the case that we're restoring from a backup and not all the needed WAL was replayed (data can be corrupt). To distinguish those cases, add a line to backup_label to indicate whether the backup was taken with pg_start/stop_backup(), or by streaming (ie. pg_basebackup). This requires re-initdb, because of a new field added to the control file.	2011-08-10 09:22:49 +03:00
Tom Lane	9f17ffd866	Measure WaitLatch's timeout parameter in milliseconds, not microseconds. The original definition had the problem that timeouts exceeding about 2100 seconds couldn't be specified on 32-bit machines. Milliseconds seem like sufficient resolution, and finer grain than that would be fantasy anyway on many platforms. Back-patch to 9.1 so that this aspect of the latch API won't change between 9.1 and later releases. Peter Geoghegan	2011-08-09 18:52:29 -04:00
Tom Lane	4e15a4db5e	Documentation improvement and minor code cleanups for the latch facility. Improve the documentation around weak-memory-ordering risks, and do a pass of general editorialization on the comments in the latch code. Make the Windows latch code more like the Unix latch code where feasible; in particular provide the same Assert checks in both implementations. Fix poorly-placed WaitLatch call in syncrep.c. This patch resolves, for the moment, concerns around weak-memory-ordering bugs in latch-related code: we have documented the restrictions and checked that existing calls meet them. In 9.2 I hope that we will install suitable memory barrier instructions in SetLatch/ResetLatch, so that their callers don't need to be quite so careful.	2011-08-09 15:30:45 -04:00
Tom Lane	cff60f2dfa	Avoid creating PlaceHolderVars immediately within PlaceHolderVars. Such a construction is useless since the lower PlaceHolderVar is already nullable; no need to make it more so. Noted while pursuing bug #6154. This is just a minor planner efficiency improvement, since the final plan will come out the same anyway after PHVs are flattened. So not worth the risk of back-patching.	2011-08-09 11:34:20 -04:00
Peter Eisentraut	f4a9da0a15	Use clearer notation for getnameinfo() return handling. Writing if (getnameinfo(...)) handle_error(); reads quite strangely, so use something like if (getnameinfo(...) != 0) handle_error(); instead.	2011-08-09 18:30:32 +03:00
Heikki Linnakangas	77949a2913	Change the way string relopts are allocated. Don't try to allocate the default value for a string relopt in the same palloc chunk as the relopt_string struct. That didn't work too well if you added a built-in string relopt in the stringRelOpts array, as it's not possible to have an initializer for a variable length struct in C. This makes the code slightly simpler too. While we're at it, move the call to validator function in add_string_reloption to before the allocation, so that if someone does pass a bogus default value, we don't leak memory.	2011-08-09 15:25:44 +03:00
Heikki Linnakangas	5b6c8436d7	Fix grammar and spelling in log message.	2011-08-09 11:45:25 +03:00
Tom Lane	77ba232564	Fix nested PlaceHolderVar expressions that appear only in targetlists. A PlaceHolderVar's expression might contain another, lower-level PlaceHolderVar. If the outer PlaceHolderVar is used, the inner one certainly will be also, and so we have to make sure that both of them get into the placeholder_list with correct ph_may_need values during the initial pre-scan of the query (before deconstruct_jointree starts). We did this correctly for PlaceHolderVars appearing in the query quals, but overlooked the issue for those appearing in the top-level targetlist; with the result that nested placeholders referenced only in the targetlist did not work correctly, as illustrated in bug #6154. While at it, add some error checking to find_placeholder_info to ensure that we don't try to create new placeholders after it's too late to do so; they have to all be created before deconstruct_jointree starts. Back-patch to 8.4 where the PlaceHolderVar mechanism was introduced.	2011-08-09 00:50:07 -04:00
Tom Lane	05e8396892	Clean up ill-advised attempt to invent a private set of Node tags. Somebody thought it'd be cute to invent a set of Node tag numbers that were defined independently of, and indeed conflicting with, the main tag-number list. While this accidentally failed to fail so far, it would certainly lead to trouble as soon as anyone wanted to, say, apply copyObject to these node types. Clang was already complaining about the use of makeNode on these tags, and I think quite rightly so. Fix by pushing these node definitions into the mainstream, including putting replnodes.h where it belongs.	2011-08-06 14:53:49 -04:00
Tom Lane	375aa7b393	Reduce PG_SYSLOG_LIMIT to 900 bytes. The previous limit of 1024 was set on the assumption that all modern syslog implementations have line length limits of 2KB or so. However, this is false, as at least Solaris and sysklogd truncate at only 1KB. 900 seems to leave enough room for the max likely length of the tacked-on prefixes, so let's go with that. As with the previous change, it doesn't seem wise to back-patch this into already-released branches; but it should be OK to sneak it into 9.1. Noah Misch	2011-08-05 21:02:31 -04:00
Robert Haas	c4096c7639	Allow per-column foreign data wrapper options. Shigeru Hanada, with fairly minor editing by me.	2011-08-05 13:24:03 -04:00
Robert Haas	84e3712677	Create VXID locks "lazily" in the main lock table. Instead of entering them on transaction startup, we materialize them only when someone wants to wait, which will occur only during CREATE INDEX CONCURRENTLY. In Hot Standby mode, the startup process must also be able to probe for conflicting VXID locks, but the lock need never be fully materialized, because the startup process does not use the normal lock wait mechanism. Since most VXID locks never need to touch the lock manager partition locks, this can significantly reduce blocking contention on read-heavy workloads. Patch by me. Review by Jeff Davis.	2011-08-04 12:38:33 -04:00
Robert Haas	4af43ee3f1	Make pgbench use erand48() rather than random(). glibc renders random() thread-safe by wrapping a futex lock around it; testing reveals that this limits the performance of pgbench on machines with many CPU cores. Rather than switching to random_r(), which is only available on GNU systems and crashes unless you use undocumented alchemy to initialize the random state properly, switch to our built-in implementation of erand48(), which is both thread-safe and concurrent. Since the list of reasons not to use the operating system's erand48() is getting rather long, rename ours to pg_erand48() (and similarly for our implementations of lrand48() and srand48()) and just always use those. We were already doing this on Cygwin anyway, and the glibc implementation is not quite thread-safe, so pgbench wouldn't be able to use that either. Per discussion with Tom Lane.	2011-08-03 16:26:40 -04:00
Tom Lane	ac36e6f71f	Move CheckRecoveryConflictDeadlock() call to a safer place. This kluge was inserted in a spot apparently chosen at random: the lock manager's state is not yet fully set up for the wait, and in particular LockWaitCancel hasn't been armed by setting lockAwaited, so the ProcLock will not get cleaned up if the ereport is thrown. This seems to not cause any observable problem in trivial test cases, because LockReleaseAll will silently clean up the debris; but I was able to cause failures with tests involving subtransactions. Fixes breakage induced by commit `c85c941470`. Back-patch to all affected branches.	2011-08-02 15:16:29 -04:00
Tom Lane	2e53bd5517	Fix incorrect initialization of ProcGlobal->startupBufferPinWaitBufId. It was initialized in the wrong place and to the wrong value. With bad luck this could result in incorrect query-cancellation failures in hot standby sessions, should a HS backend be holding pin on buffer number 1 while trying to acquire a lock.	2011-08-02 13:23:52 -04:00
Heikki Linnakangas	89df948ec2	Avoid integer overflow when LIMIT + OFFSET >= 2^63. This fixes bug #6139 reported by Hitoshi Harada.	2011-08-02 10:47:17 +03:00
Robert Haas	85b436f7b1	Minor stylistic corrections.	2011-08-01 08:24:45 -04:00
Peter Eisentraut	8a0fa9cad9	Add host name resolution information to pg_hba.conf error messages. This is to be able to analyze issues with host names in pg_hba.conf.	2011-07-31 18:03:43 +03:00
Robert Haas	b4fbe392f8	Reduce sinval synchronization overhead. Testing shows that the overhead of acquiring and releasing SInvalReadLock and msgNumLock on high-core count boxes can waste a lot of CPU time and hurt performance. This patch adds a per-backend flag that allows us to skip all that locking in most cases. Further testing shows that this improves performance even when sinval traffic is very high. Patch by me. Review and testing by Noah Misch.	2011-07-29 16:46:13 -04:00
Peter Eisentraut	0fe8150827	Minor message style adjustment	2011-07-27 23:54:46 +03:00
Tom Lane	c1420fcf7d	Check to see whether libxml2 handles error context the way we expect. It turns out to be possible to link against a libxml2.so that does this differently than the version we configured and built against, so we need a runtime check to avoid bizarre behavior. Per report from Bernd Helmle. Patch by Florian Pflug.	2011-07-26 16:31:04 -04:00
Peter Eisentraut	ce8d7bb644	Replace printf format %i by %d. They are identical, but the overwhelming majority of the code uses %d, so standardize on that.	2011-07-26 22:54:29 +03:00
Andrew Dunstan	74e6d37276	Silence compiler warning about uninitialized variable. It is set correctly on the only path that uses it, but the compiler can't know that.	2011-07-25 19:37:17 -04:00
Tom Lane	d0c23026b2	Use OpenSSL's SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER flag. This disables an entirely unnecessary "sanity check" that causes failures in nonblocking mode, because OpenSSL complains if we move or compact the write buffer. The only actual requirement is that we not modify pending data once we've attempted to send it, which we don't. Per testing and research by Martin Pihlak, though this fix is a lot simpler than his patch. I put the same change into the backend, although it's less clear whether it's necessary there. We do use nonblock mode in some situations in streaming replication, so seems best to keep the same behavior in the backend as in libpq. Back-patch to all supported releases.	2011-07-24 15:17:51 -04:00
Tom Lane	988cccc620	Rethink behavior of CREATE OR REPLACE during CREATE EXTENSION. The original implementation simply did nothing when replacing an existing object during CREATE EXTENSION. The folly of this was exposed by a report from Marc Munro: if the existing object belongs to another extension, we are left in an inconsistent state. We should insist that the object does not belong to another extension, and then add it to the current extension if not already a member.	2011-07-23 16:59:39 -04:00
Robert Haas	6f1be5a67a	Unbreak unlogged tables. I broke this in commit `5da79169d3`, which was obviously insufficiently well tested. Add some regression tests in the hope of making future slip-ups more likely to be noticed.	2011-07-22 16:15:43 -04:00
Tom Lane	0ce7676aa0	Make xpath() do something useful with XPath expressions that return scalars. Previously, xpath() simply returned an empty array if the expression did not yield a node set. This is useless for expressions that return scalars, such as one with name() at the top level. Arrange to return the scalar value as a single-element xml array, instead. (String values will be suitably escaped.) This change will also cause xpath_exists() to return true, not false, for such expressions. Florian Pflug, reviewed by Radoslaw Smogura	2011-07-21 11:32:46 -04:00
Tom Lane	aaf15e5c1c	Ensure that xpath() escapes special characters in string values. Without this it's possible for the output to not be legal XML, as illustrated by the added regression test cases. NB: this change will need to be called out as an incompatibility in the 9.2 release notes, since it's possible somebody was relying on the old behavior, even though it's clearly wrong. Florian Pflug, reviewed by Radoslaw Smogura	2011-07-20 18:44:35 -04:00
Robert Haas	463f2625a5	Support SECURITY LABEL on databases, tablespaces, and roles. This requires a new shared catalog, pg_shseclabel. Along the way, fix the security_label regression tests so that they don't monkey with the labels of any pre-existing objects. This is unlikely to matter in practice, since only the label for the "dummy" provider was being manipulated. But this way still seems cleaner. KaiGai Kohei, with fairly extensive hacking by me.	2011-07-20 13:18:24 -04:00
Tom Lane	cacd42d62c	Rewrite libxml error handling to be more robust. libxml reports some errors (like invalid xmlns attributes) via the error handler hook, but still returns a success indicator to the library caller. This causes us to miss some errors that are important to report. Since the "generic" error handler hook doesn't know whether the message it's getting is for an error, warning, or notice, stop using that and instead start using the "structured" error handler hook, which gets enough information to be useful. While at it, arrange to save and restore the error handler hook setting in each libxml-using function, rather than assuming we can set and forget the hook. This should improve the odds of working nicely with third-party libraries that also use libxml. In passing, volatile-ize some local variables that get modified within PG_TRY blocks. I noticed this while testing with an older gcc version than I'd previously tried to compile xml.c with. Florian Pflug and Tom Lane, with extensive review/testing by Noah Misch	2011-07-20 13:03:49 -04:00
Simon Riggs	7cb7122800	Remove O(N^2) performance issue with multiple SAVEPOINTs. Subtransaction locks now released en masse at main commit, rather than repeatedly re-scanning for locks as we ascend the nested transaction tree. Split transaction state TBLOCK_SUBEND into two states, TBLOCK_SUBCOMMIT and TBLOCK_SUBRELEASE to allow the commit path to be optimised using the existing code in ResourceOwnerRelease() which appears to have been intended for this usage, judging from comments therein.	2011-07-19 17:21:24 +01:00
Robert Haas	8e5ac74c12	Some refinement for the "fast path" lock patch. 1. In GetLockStatusData, avoid initializing instance before we've ensured that the array is large enough. Otherwise, if repalloc moves the block around, we're hosed. 2. Add the word "Relation" to the name of some identifiers, to avoid assuming that the fast-path mechanism will only ever apply to relations (though these particular parts certainly will). Some of the macros could possibly use similar treatment, but the names are getting awfully long already. 3. Add a missing word to comment in AtPrepare_Locks().	2011-07-19 12:10:15 -04:00
Robert Haas	cdd61237d6	Remove superfluous variable. Reported by Peter Eisentraut.	2011-07-19 10:30:26 -04:00
Simon Riggs	4bd8ed31b7	Introduce sending servers as new category for replication params Fujii Masao	2011-07-19 08:59:55 +01:00
Peter Eisentraut	30f854537d	Change debug message from ereport to elog	2011-07-19 07:50:10 +03:00
Simon Riggs	5286105800	Cascading replication feature for streaming log-based replication. Standby servers can now have WALSender processes, which can work with either WALReceiver or archive_commands to pass data. Fully updated docs, including new conceptual terms of sending server, upstream and downstream servers. WALSenders are terminated on promotion to master. Fujii Masao, review, rework and doc rewrite by Simon Riggs	2011-07-19 03:40:03 +01:00
Tom Lane	3d4890c0c5	Add GET STACKED DIAGNOSTICS plpgsql command to retrieve exception info. This is more SQL-spec-compliant, more easily extensible, and better performing than the old method of inventing special variables. Pavel Stehule, reviewed by Shigeru Hanada and David Wheeler	2011-07-18 14:47:18 -04:00
Robert Haas	367bc426a1	Avoid index rebuild for no-rewrite ALTER TABLE .. ALTER TYPE. Noah Misch. Review and minor cosmetic changes by me.	2011-07-18 11:04:43 -04:00
Robert Haas	3cba8999b3	Create a "fast path" for acquiring weak relation locks. When an AccessShareLock, RowShareLock, or RowExclusiveLock is requested on an unshared database relation, and we can verify that no conflicting locks can possibly be present, record the lock in a per-backend queue, stored within the PGPROC, rather than in the primary lock table. This eliminates a great deal of contention on the lock manager LWLocks. This patch also refactors the interface between GetLockStatusData() and pg_lock_status() to be a bit more abstract, so that we don't rely so heavily on the lock manager's internal representation details. The new fast path lock structures don't have a LOCK or PROCLOCK structure to return, so we mustn't depend on that for purposes of listing outstanding locks. Review by Jeff Davis.	2011-07-18 00:49:28 -04:00
Robert Haas	b59d2fe497	Add pg_opfamily_is_visible. We already have similar functions for many other object types, including operator classes, so it seems like we should have this one, too. Extracted from a larger patch by Josh Kupershmidt	2011-07-17 23:23:55 -04:00
Tom Lane	9473bb96d0	Further thoughts about temp_file_limit patch. Move FileClose's decrement of temporary_files_size up, so that it will be executed even if elog() throws an error. This is reasonable since if the unlink() fails, the fact the file is still there is not our fault, and we are going to forget about it anyhow. So we won't count it against temp_file_limit anymore. Update fileSize and temporary_files_size correctly in FileTruncate. We probably don't have any places that truncate temp files, but fd.c surely should not assume that.	2011-07-17 15:05:44 -04:00
Tom Lane	23e5b16c71	Add temp_file_limit GUC parameter to constrain temporary file space usage. The limit is enforced against the total amount of temp file space used by each session. Mark Kirkwood, reviewed by Cédric Villemain and Tatsuo Ishii	2011-07-17 14:19:31 -04:00
Tom Lane	1bc16a9460	Improve make_subplanTargetList to avoid including Vars unnecessarily. If a Var was used only in a GROUP BY expression, the previous implementation would include the Var by itself (as well as the expression) in the generated targetlist. This wouldn't affect the efficiency of the scan/join part of the plan at all, but it could result in passing unnecessarily-wide rows through sorting and grouping steps. It turns out to take only a little more code, and not noticeably more time, to generate a tlist without such redundancy, so let's do that. Per a recent gripe from HarmeekSingh Bedi.	2011-07-16 16:46:55 -04:00
Tom Lane	1af37ec96d	Replace errdetail("%s", ...) with errdetail_internal("%s", ...). There may be some other places where we should use errdetail_internal, but they'll have to be evaluated case-by-case. This commit just hits a bunch of places where invoking gettext is obviously a waste of cycles.	2011-07-16 14:22:18 -04:00
Tom Lane	3ee7c8710d	Use errdetail_internal() for SSI transaction cancellation details. Per discussion, these seem too technical to be worth translating. Kevin Grittner	2011-07-16 14:22:16 -04:00
Tom Lane	ed7ed76712	Add an errdetail_internal() ereport auxiliary routine. This function supports untranslated detail messages, in the same way that errmsg_internal supports untranslated primary messages. We've needed this for some time IMO, but discussion of some cases in the SSI code provided the impetus to actually add it. Kevin Grittner, with minor adjustments by me	2011-07-16 14:22:15 -04:00
Magnus Hagander	0886dde5f8	Fix SSPI login when multiple roundtrips are required This fixes SSPI login failures showing "The function requested is not supported", often showing up when connecting to localhost. The reason was not properly updating the SSPI handle when multiple roundtrips were required to complete the authentication sequence. Report and analysis by Ahmed Shinwari, patch by Magnus Hagander	2011-07-16 19:58:53 +02:00
Peter Eisentraut	bf3c585681	Set information_schema.tables.commit_action to null The commit action of temporary tables is currently not cataloged, so we can't easily show it. The previous value was outdated from before we had different commit actions.	2011-07-15 21:11:14 +03:00
Heikki Linnakangas	8d260911e8	Change the way the offset of downlink is stored in GISTInsertStack. GISTInsertStack.childoffnum used to mean "offset of the downlink in this node, pointing to the child node in the stack". It's now replaced with downlinkoffnum, which means "offset of the downlink in the parent of this node". gistFindPath() already used childoffnum with this new meaning, and had an extra step at the end to pull all the childoffnum values down one node in the stack, to adjust the stack for the meaning that childoffnum had elsewhere. That's no longer required. The reason to do this now is this new representation is more convenient for the GiST fast build patch that Alexander Korotkov is working on. While we're at it, replace the linked list used in gistFindPath with a standard List, and make gistFindPath() static. Alexander Korotkov, with some changes by me.	2011-07-15 12:18:30 +03:00
Heikki Linnakangas	bc175eb805	Fix two ancient bugs in GiST code to re-find a parent after page split: First, when following a right-link, we incorrectly marked the current page as the parent of the right sibling. In reality, the parent of the right page is the same as the parent of the current page (or some page to the right of it, gistFindCorrectParent() will sort that out). Secondly, when we follow a right-link, we must prepend, not append, the right page to our list of pages to visit. That's because we assume that once we hit a leaf page in the list, all the rest are leaf pages too, and give up. To hit these bugs, you need concurrent actions and several unlucky accidents. Another backend must split the root page, while you're in process of splitting a lower-level page. Furthermore, while you scan the internal nodes to re-find the parent, another backend needs to again split some more internal pages. Even then, the bugs don't necessarily manifest as user-visible errors or index corruption. While we're at it, make the error reporting a bit better if gistFindPath() fails to re-find the parent. It used to be an assertion, but an elog() seems more appropriate. Backpatch to all supported branches.	2011-07-15 11:05:12 +03:00
Tom Lane	f3ff0433ab	In planner, don't assume that empty parent tables aren't really empty. There's a heuristic in estimate_rel_size() to clamp the minimum size estimate for a table to 10 pages, unless we can see that vacuum or analyze has been run (and set relpages to something nonzero, so this will always happen for a table that's actually empty). However, it would be better not to do this for inheritance parent tables, which very commonly are really empty and can be expected to stay that way. Per discussion of a recent pgsql-performance report from Anish Kejariwal. Also prevent it from happening for indexes (although this is more in the nature of documentation, since CREATE INDEX normally initializes relpages to something nonzero anyway). Back-patch to 9.0, because the ability to collect statistics across a whole inheritance tree has improved the planner's estimates to the point where this relatively small error makes a significant difference. In the referenced report, merge or hash joins were incorrectly estimated as cheaper than a nestloop with inner indexscan on the inherited table. That was less likely before 9.0 because the lack of inherited stats would have resulted in a default (and rather pessimistic) estimate of the cost of a merge or hash join.	2011-07-14 17:30:57 -04:00
Peter Eisentraut	f4678c205a	Set information_schema.routines.is_udt_dependent to NO It previously said YES, but that is incorrect.	2011-07-14 19:18:17 +03:00
Tom Lane	96f990e23b	Update some comments to clarify who does what in targetlist creation. No code changes; just avoid blaming query_planner for things it doesn't really do.	2011-07-13 20:23:09 -04:00
Peter Eisentraut	0527a454ec	Implement information schema interval_type columns Also correct reporting of interval precision when field restrictions are specified in the typmod.	2011-07-13 20:32:08 +03:00
Tom Lane	c1d9579dd8	Avoid listing ungrouped Vars in the targetlist of Agg-underneath-Window. Regular aggregate functions in combination with, or within the arguments of, window functions are OK per spec; they have the semantics that the aggregate output rows are computed and then we run the window functions over that row set. (Thus, this combination is not really useful unless there's a GROUP BY so that more than one aggregate output row is possible.) The case without GROUP BY could fail, as recently reported by Jeff Davis, because sloppy construction of the Agg node's targetlist resulted in extra references to possibly-ungrouped Vars appearing outside the aggregate function calls themselves. See the added regression test case for an example. Fixing this requires modifying the API of flatten_tlist and its underlying function pull_var_clause. I chose to make pull_var_clause's API for aggregates identical to what it was already doing for placeholders, since the useful behaviors turn out to be the same (error, report node as-is, or recurse into it). I also tightened the error checking in this area a bit: if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here, that was a long time ago, so complain instead of ignoring them. Backpatch into 9.1. The failure exists in 8.4 and 9.0 as well, but seeing that it only occurs in a basically-useless corner case, it doesn't seem worth the risks of changing a function API in a minor release. There might be third-party code using pull_var_clause.	2011-07-12 18:24:39 -04:00
Bruce Momjian	afc9635c60	Add C comment that txid_current() assigns an XID if one is not already assigned.	2011-07-11 20:33:07 -04:00
Peter Eisentraut	3315020a09	Fix and clarify information schema interval_precision fields The fields were previously wrongly typed as character_data; change to cardinal_number. Update the documentation and the implementation to show more clearly that this applies to a feature not available in PostgreSQL, rather than just not yet being implemented in the information schema.	2011-07-11 18:49:44 +03:00
Robert Haas	4240e429d0	Try to acquire relation locks in RangeVarGetRelid. In the previous coding, we would look up a relation in RangeVarGetRelid, lock the resulting OID, and then AcceptInvalidationMessages(). While this was sufficient to ensure that we noticed any changes to the relation definition before building the relcache entry, it didn't handle the possibility that the name we looked up no longer referenced the same OID. This was particularly problematic in the case where a table had been dropped and recreated: we'd latch on to the entry for the old relation and fail later on. Now, we acquire the relation lock inside RangeVarGetRelid, and retry the name lookup if we notice that invalidation messages have been processed meanwhile. Many operations that would previously have failed with an error in the presence of concurrent DDL will now succeed. There is a good deal of work remaining to be done here: many callers of RangeVarGetRelid still pass NoLock for one reason or another. In addition, nothing in this patch guards against the possibility that the meaning of an unqualified name might change due to the creation of a relation in a schema earlier in the user's search path than the one where it was previously found. Furthermore, there's nothing at all here to guard against similar race conditions for non-relations. For all that, it's a start. Noah Misch and Robert Haas	2011-07-08 22:19:30 -04:00
Tom Lane	9d522cb35d	Fix another oversight in logging of changes in postgresql.conf settings. We were using GetConfigOption to collect the old value of each setting, overlooking the possibility that it didn't exist yet. This does happen in the case of adding a new entry within a custom variable class, as exhibited in bug #6097 from Maxim Boguk. To fix, add a missing_ok parameter to GetConfigOption, but only in 9.1 and HEAD --- it seems possible that some third-party code is using that function, so changing its API in a minor release would cause problems. In 9.0, create a near-duplicate function instead.	2011-07-08 17:02:58 -04:00
Heikki Linnakangas	89fd72cbf2	Introduce a pipe between postmaster and each backend, which can be used to detect postmaster death. Postmaster keeps the write-end of the pipe open, so when it dies, children get EOF in the read-end. That can conveniently be waited for in select(), which allows eliminating some of the polling loops that check for postmaster death. This patch doesn't yet change all the loops to use the new mechanism, expect a follow-on patch to do that. This changes the interface to WaitLatch, so that it takes as argument a bitmask of events that it waits for. Possible events are latch set, timeout, postmaster death, and socket becoming readable or writeable. The pipe method behaves slightly differently from the kill() method previously used in PostmasterIsAlive() in the case that postmaster has died, but its parent has not yet read its exit code with waitpid(). The pipe returns EOF as soon as the process dies, but kill() continues to return true until waitpid() has been called (IOW while the process is a zombie). Because of that, change PostmasterIsAlive() to use the pipe too, otherwise WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while PostmasterIsAlive() would claim it's still alive. That could easily lead to busy-waiting while postmaster is in zombie state. Peter Geoghegan with further changes by me, reviewed by Fujii Masao and Florian Pflug.	2011-07-08 18:44:07 +03:00
Heikki Linnakangas	9598afa3b0	Fix one overflow and one signedness error, caused by the patch to calculate OLDSERXID_MAX_PAGE based on BLCKSZ. MSVC compiler warned about these.	2011-07-08 17:29:53 +03:00
Peter Eisentraut	f05c65090a	Message style improvements	2011-07-08 07:37:04 +03:00
Heikki Linnakangas	bdaabb9b22	There's a small window wherein a transaction is committed but not yet on the finished list, and we shouldn't flag it as a potential conflict if so. We can also skip adding a doomed transaction to the list of possible conflicts because we know it won't commit. Dan Ports and Kevin Grittner.	2011-07-08 00:36:30 +03:00
Heikki Linnakangas	406d61835b	SSI has a race condition, where the order of commit sequence numbers of transactions might not match the order in which the work done in those transactions becomes visible to others. The logic in SSI, however, assumed that it does. Fix that by having two sequence numbers for each serializable transaction, one taken before a transaction becomes visible to others, and one after it. This is easier than trying to make the transition totally atomic, which would require holding ProcArrayLock and SerializableXactHashLock at the same time. By using prepareSeqNo instead of commitSeqNo in a few places where commit sequence numbers are compared, we can make those comparisons err on the safe side when we don't know for sure which committed first. Per analysis by Kevin Grittner and Dan Ports, but this approach to fix it is different from the original patch.	2011-07-07 23:26:34 +03:00
Tom Lane	60a81ad133	Reclassify replication-related GUC variables as "master" and "standby". Per discussion, this structure seems more understandable than what was there before. Make config.sgml and postgresql.conf.sample agree. In passing do a bit of editorial work on the variable descriptions.	2011-07-07 15:11:41 -04:00
Robert Haas	5b2b444f66	Adjust OLDSERXID_MAX_PAGE based on BLCKSZ. The value when BLCKSZ = 8192 is unchanged, but with larger-than-normal block sizes we might need to crank things back a bit, as we'll have more entries per page than normal in that case. Kevin Grittner	2011-07-07 15:05:21 -04:00
Tom Lane	a195e3c34f	Finish disabling reduced-lock-levels-for-DDL feature. Previous patch only covered the ALTER TABLE changes, not changes in other commands; and it neglected to revert the documentation changes.	2011-07-07 13:15:15 -04:00
Heikki Linnakangas	928408d9e5	Fix a bug with SSI and prepared transactions: If there's a dangerous structure T0 ---> T1 ---> T2, and T2 commits first, we need to abort something. If T2 commits before both conflicts appear, then it should be caught by OnConflict_CheckForSerializationFailure. If both conflicts appear before T2 commits, it should be caught by PreCommit_CheckForSerializationFailure. But that is actually run when T2 prepares. Fix that in OnConflict_CheckForSerializationFailure, by treating a prepared T2 as if it committed already. This is mostly a problem for prepared transactions, which are in prepared state for some time, but also for regular transactions because they also go through the prepared state in the SSI code for a short moment when they're committed. Kevin Grittner and Dan Ports	2011-07-07 18:12:15 +03:00
Tom Lane	14f67192c2	Remove assumptions that not-equals operators cannot be in any opclass. get_op_btree_interpretation assumed this in order to save some duplication of code, but it's not true in general anymore because we added <> support to btree_gist. (We still assume it for btree opclasses, though.) Also, essentially the same logic was baked into predtest.c. Get rid of that duplication by generalizing get_op_btree_interpretation so that it can be used by predtest.c. Per bug report from Denis de Bernardy and investigation by Jeff Davis, though I didn't use Jeff's patch exactly as-is. Back-patch to 9.1; we do not support this usage before that.	2011-07-06 14:53:16 -04:00
Tom Lane	2e56fa8632	Call FDW validator functions even when the options list is empty. This is useful since a validator might want to require certain options to be provided. The passed array is an empty text array in this case. Per suggestion by Laurenz Albe, though this is not quite his patch.	2011-07-05 18:21:12 -04:00
Peter Eisentraut	9a0bdc8db5	Message style improvements of errmsg_internal() calls	2011-07-05 23:01:35 +03:00
Peter Eisentraut	27af66162b	Message style tweaks	2011-07-05 00:01:35 +03:00
Peter Eisentraut	6fbc80349f	Set user_defined_types.data_type to null On re-reading the standard, this field is only used for distinct or reference types.	2011-07-04 23:09:42 +03:00
Alvaro Herrera	b93f5a5673	Move Trigger and TriggerDesc structs out of rel.h into a new reltrigger.h This lets us stop including rel.h into execnodes.h, which is a widely used header.	2011-07-04 14:35:58 -04:00
Alvaro Herrera	d665162077	Don't try to use a constraint name as domain name The bug that caused this to be discovered is that the code was trying to dereference a NULL or ill-defined pointer, as reported by Michael Mueller; but what it was doing was wrong anyway, per Heikki. This patch is Heikki's suggested fix.	2011-07-04 14:33:44 -04:00
Peter Eisentraut	9f084527a4	Remove unused variable to silence compiler warning	2011-07-04 18:03:17 +03:00
Heikki Linnakangas	f7ea6beaf4	Remove silent_mode. You get the same functionality with "pg_ctl -l postmaster.log", or nohup. There was a small issue with LINUX_OOM_ADJ and silent_mode, namely that with silent_mode the postmaster process incorrectly used the OOM settings meant for backend processes. We certainly could've fixed that directly, but since silent_mode was redundant anyway, we might as well just remove it.	2011-07-04 14:35:44 +03:00
Simon Riggs	2c3d9db56d	Reset ALTER TABLE lock levels to AccessExclusiveLock in all cases. Locks on inheritance parent remain at lower level, as they were before. Remove entry from 9.1 release notes.	2011-07-04 09:31:40 +01:00
Robert Haas	5da79169d3	Fix bugs in relpersistence handling during table creation. Unlike the relistemp field which it replaced, relpersistence must be set correctly quite early during the table creation process, as we rely on it quite early on for a number of purposes, including security checks. Normally, this is set based on whether the user enters CREATE TABLE, CREATE UNLOGGED TABLE, or CREATE TEMPORARY TABLE, but a relation may also be made implicitly temporary by creating it in pg_temp. This patch fixes the handling of that case, and also disables creation of unlogged tables in temporary tablespace (such tables do indeed skip WAL-logging, but we reject an explicit specification) and creation of relations in the temporary schemas of other sessions (which is not very sensible, and didn't work right anyway). Report by Amit Khandekar.	2011-07-03 17:34:47 -04:00
Magnus Hagander	24e2d4b6ba	Mark pg_stat_reset_shared as strict This is the proper fix for bug #6082 about pg_stat_reset_shared(NULL) causing a crash, and it reverts commit `79aa44536f` on head. The workaround of throwing an error from inside the function is left on backbranches (including 9.1) since this change requires a new initdb.	2011-07-03 13:15:58 +02:00
Tom Lane	426cafc46c	Suppress compiler warning about potentially uninitialized variable. Maybe some compilers are smart enough to not complain about the previous coding ... but mine isn't.	2011-07-01 20:57:34 -04:00
Alvaro Herrera	897795240c	Enable CHECK constraints to be declared NOT VALID This means that they can initially be added to a large existing table without checking its initial contents, but new tuples must comply with them; a separate pass invoked by ALTER TABLE / VALIDATE can verify existing data and ensure it complies with the constraint, at which point it is marked validated and becomes a normal part of the table ecosystem. A non-validated CHECK constraint is ignored in the planner for constraint_exclusion purposes; when validated, cached plans are recomputed so that partitioning starts working right away. This patch also enables domains to have unvalidated CHECK constraints attached to them as well by way of ALTER DOMAIN / ADD CONSTRAINT / NOT VALID, which can later be validated with ALTER DOMAIN / VALIDATE CONSTRAINT. Thanks to Thom Brown, Dean Rasheed and Jaime Casanova for the various reviews, and Robert Haas for documentation wording improvement suggestions. This patch was sponsored by Enova Financial.	2011-06-30 11:24:31 -04:00
Alvaro Herrera	b36927fbe9	Fix outdated comment Extracted from a patch by Bernd Helmle	2011-06-29 19:49:47 -04:00
Tom Lane	a5652d3e05	Restore correct btree preprocessing of "indexedcol IS NULL" conditions. Such a condition is unsatisfiable in combination with any other type of btree-indexable condition (since we assume btree operators are always strict). 8.3 and 8.4 had an explicit test for this, which I removed in commit `29c4ad9829`, mistakenly thinking that the case would be subsumed by the more general handling of IS (NOT) NULL added in that patch. Put it back, and improve the comments about it, and add a regression test case. Per bug #6079 from Renat Nasyrov, and analysis by Dean Rasheed.	2011-06-29 19:46:47 -04:00
Heikki Linnakangas	cd70dd6bef	Move the PredicateLockRelation() call from nodeSeqscan.c to heapam.c. It's more consistent that way, since all the other PredicateLock* calls are made in various heapam.c and index AM functions. The call in nodeSeqscan.c was unnecessarily aggressive anyway, there's no need to try to lock the relation every time a tuple is fetched, it's enough to do it once. This has the user-visible effect that if a seq scan is initialized in the executor, but never executed, we now acquire the predicate lock on the heap relation anyway. We could avoid that by taking the lock on the first heap_getnext() call instead, but it doesn't seem worth the trouble given that it feels more natural to do it in heap_beginscan(). Also, remove the retail PredicateLockTuple() calls from heap_getnext(). In a seqscan, started with heap_begin(), we're holding a whole-relation predicate lock on the heap so there's no need to lock the tuples individually. Kevin Grittner and me	2011-06-29 21:57:43 +03:00
Heikki Linnakangas	d9fe63acb0	Grab predicate locks on matching tuples in a lossy bitmap heap scan. Non-lossy case was already handled correctly. Kevin Grittner	2011-06-29 21:50:42 +03:00
Magnus Hagander	79aa44536f	Protect pg_stat_reset_shared() against NULL input Per bug #6082, reported by Steve Haslam	2011-06-29 19:36:51 +02:00
Peter Eisentraut	21f1e15aaf	Unify spelling of "canceled", "canceling", "cancellation" We had previously (`af26857a27`) established the U.S. spellings as standard.	2011-06-29 09:28:46 +03:00
Simon Riggs	465883b0a2	Introduce compact WAL record for the common case of commit (non-DDL). XLOG_XACT_COMMIT_COMPACT leaves out invalidation messages and relfilenodes, saving considerable space for the vast majority of transaction commits. XLOG_XACT_COMMIT keeps same definition as XLOG_PAGE_MAGIC 0xD067 and earlier. Leonardo Francalanci and Simon Riggs	2011-06-28 22:58:17 +01:00

12439 Commits