postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-08-06 00:43:23 +02:00

Author	SHA1	Message	Date
Tom Lane	1c72d0dec1	Fix relcache to account properly for subtransaction status of 'new' relcache entries. Also, change TransactionIdIsCurrentTransactionId() so that if consulted during transaction abort, it will not say that the aborted xact is still current. (It would be better to ensure that it's never called at all during abort, but I'm not sure we can easily guarantee that.) In combination, these fix a crash we have seen occasionally during parallel regression tests of 8.0.	2004-08-28 20:31:44 +00:00
Tom Lane	f444dafab0	Can't truncate pg_subtrans during a recovery checkpoint --- subtrans module isn't fully initialized yet.	2004-08-28 18:18:03 +00:00
Tom Lane	fe455ee1d4	Revise ResourceOwner code to avoid accumulating ResourceOwner objects for every command executed within a transaction. For long transactions this was a significant memory leak. Instead, we can delete a portal's or subtransaction's ResourceOwner immediately, if we physically transfer the information about its locks up to the parent owner. This does not fully solve the leak problem; we need to do something about counting multiple acquisitions of the same lock in order to fix it. But it's a necessary step along the way.	2004-08-25 18:43:43 +00:00
Tom Lane	4dbb880d3c	Rearrange pg_subtrans handling as per recent discussion. pg_subtrans updates are no longer WAL-logged nor even fsync'd; we do not need to, since after a crash no old pg_subtrans data is needed again. We truncate pg_subtrans to RecentGlobalXmin at each checkpoint. slru.c's API is refactored a little bit to separate out the necessary decisions.	2004-08-23 23:22:45 +00:00
Tom Lane	f009c316ba	Tweak code so that pg_subtrans is never consulted for XIDs older than RecentXmin (== MyProc->xmin). This ensures that it will be safe to truncate pg_subtrans at RecentGlobalXmin, which should largely eliminate any fear of bloat. Along the way, eliminate SubTransXidsHaveCommonAncestor, which isn't really needed and could not give a trustworthy result anyway under the lookback restriction. In an unrelated but nearby change, #ifdef out GetUndoRecPtr, which has been dead code since 2001 and seems unlikely to ever be resurrected.	2004-08-22 02:41:58 +00:00
Tom Lane	19cd31b068	Fix bug introduced into _bt_getstackbuf() on 2003-Feb-21: the initial value of 'start' could be past the end of the page, if the page was split by some concurrent inserting process since we visited it. In this situation the code could look at bogus entries and possibly find a match (since after all those entries still contain what they had before the split). This would lead to 'specified item offset is too large' followed by 'PANIC: failed to add item to the page', as reported by Joe Conway for scenarios involving heavy concurrent insertion activity.	2004-08-17 23:15:33 +00:00
Tom Lane	1a3de15a3a	Dept. of further reflection: I looked around to see if any other callers of XLogInsert had the same sort of checkpoint interlock problem as RecordTransactionCommit, and indeed I found some. Btree index build and ALTER TABLE SET TABLESPACE write data outside the friendly confines of the buffer manager, and therefore they have to take their own responsibility for checkpoint interlock. The easiest solution seems to be to force smgrimmedsync at the end of the index build or table copy, even when the operation is being WAL-logged. This is sufficient since the new index or table will be of interest to no one if we don't get as far as committing the current transaction.	2004-08-15 23:44:46 +00:00
Bruce Momjian	10249abfa1	Cleanup Win32 COPY handling, and move archive examples to SGML.	2004-08-12 19:03:44 +00:00
Bruce Momjian	43ea65a0dc	Add mention of "WIN32" COPY.	2004-08-12 18:34:45 +00:00
Bruce Momjian	6525b42b10	Add make_native_path() because Win32 COPY is an internal CMD.EXE command and doesn't process forward slashes in the same way as external commands. Quoting the first argument to COPY does not convert forward to backward slashes, but COPY does properly process quoted forward slashes in the second argument. Win32 COPY works with quoted forward slashes in the first argument only if the current directory is the same as the directory of the first argument.	2004-08-12 18:32:52 +00:00
Tom Lane	3fdf649f4f	Fix failure to guarantee that a checkpoint will write out pg_clog updates for transaction commits that occurred just before the checkpoint. This is an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a reproducible test case to prove its existence.	2004-08-11 04:07:16 +00:00
Tom Lane	35f539b481	When expanding %p in archive_command or restore_command, translate slashes to backslashes #ifdef WIN32. This is to cope with the fact that Windows seems exceedingly unfriendly to slashes in shell commands, as per recent discussion.	2004-08-09 16:26:06 +00:00
Tom Lane	7dca975c5d	Add a comment about why we always replay backup blocks from WAL.	2004-08-08 03:22:08 +00:00
Tom Lane	fcbc438727	Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments and documentation to reference 8.0 instead of 7.5.	2004-08-04 21:34:35 +00:00
Tom Lane	b387d16f96	Make use of backup label/history files to control recovery properly.	2004-08-04 16:25:02 +00:00
Tom Lane	58c41712d5	Add functions pg_start_backup, pg_stop_backup to create backup label and history files as per recent discussion. While at it, remove pg_terminate_backend, since we have decided we do not have time during this release cycle to address the reliability concerns it creates. Split the 'Miscellaneous Functions' documentation section into 'System Information Functions' and 'System Administration Functions', which hopefully will draw the eyes of those looking for such things.	2004-08-03 20:32:36 +00:00
Tom Lane	a83c45c4c6	Fix misplacement of savepointLevel test, per report from Chris K-L.	2004-08-03 15:57:26 +00:00
Tom Lane	410b1dfb88	Update the in-code documentation about the transaction system. Move it into a README file instead of being in xact.c's header comment. Alvaro Herrera.	2004-08-01 20:57:59 +00:00
Tom Lane	5cc380f9a3	Error message style adjustments, per Alvaro Herrera.	2004-08-01 17:45:43 +00:00
Tom Lane	efcaf1e868	Some mop-up work for savepoints (nested transactions). Store a small number of active subtransaction XIDs in each backend's PGPROC entry, and use this to avoid expensive probes into pg_subtrans during TransactionIdIsInProgress. Extend EOXactCallback API to allow add-on modules to get control at subxact start/end. (This is deliberately not compatible with the former API, since any uses of that API probably need manual review anyway.) Add basic reference documentation for SAVEPOINT and related commands. Minor other cleanups to check off some of the open issues for subtransactions. Alvaro Herrera and Tom Lane.	2004-08-01 17:32:22 +00:00
Tom Lane	beda4814c1	plpgsql does exceptions. There are still some things that need refinement; in particular I fear that the recognized set of error condition names probably has little in common with what Oracle recognizes. But it's a start.	2004-07-31 07:39:21 +00:00
Tom Lane	1bf3d61504	Fix subtransaction behavior for large objects, temp namespace, files, password/group files. Also allow read-only subtransactions of a read-write parent, but not vice versa. These are the reasonably noncontroversial parts of Alvaro's recent mop-up patch, plus further work on large objects to minimize use of the TopTransactionResourceOwner.	2004-07-28 14:23:31 +00:00
Tom Lane	cc813fc2b8	Replace nested-BEGIN syntax for subtransactions with spec-compliant SAVEPOINT/RELEASE/ROLLBACK-TO syntax. (Alvaro) Cause COMMIT of a failed transaction to report ROLLBACK instead of COMMIT in its command tag. (Tom) Fix a few loose ends in the nested-transactions stuff.	2004-07-27 05:11:48 +00:00
Tom Lane	acd907bfcc	Add cross-check that current timeline of pg_control is an ancestor of recovery_target_timeline --- otherwise there is no path from the backup to the requested timeline. This check was foreseen in the original discussion but I forgot to implement it.	2004-07-22 21:09:37 +00:00
Tom Lane	3dba9cb694	Add a check on file size as an additional safety check that a WAL file recovered from archive is not corrupt. It's not much but it will catch one common problem, viz out-of-disk-space. Also, force a WAL recovery scan when recovery.conf is present, even if pg_control shows a clean shutdown. This allows recovery with a tar backup that was taken with the postmaster shut down, as per complaint from Mark Kirkwood.	2004-07-22 20:18:40 +00:00
Tom Lane	2042b3428d	Invent WAL timelines, as per recent discussion, to make point-in-time recovery more manageable. Also, undo recent change to add FILE_HEADER and WASTED_SPACE records to XLOG; instead make the XLOG page header variable-size with extra fields in the first page of an XLOG file. This should fix the boundary-case bugs observed by Mark Kirkwood. initdb forced due to change of XLOG representation.	2004-07-21 22:31:26 +00:00
Tom Lane	9c7a765f02	Remove unportable use of strptime() to parse recovery target time spec. Instead use our own abstimein code, which is more flexible anyway.	2004-07-19 14:34:39 +00:00
Tom Lane	66ec2db728	XLOG file archiving and point-in-time recovery. There are still some loose ends and a glaring lack of documentation, but it basically works. Simon Riggs with some editorialization by Tom Lane.	2004-07-19 02:47:16 +00:00
Tom Lane	fe548629c5	Invent ResourceOwner mechanism as per my recent proposal, and use it to keep track of portal-related resources separately from transaction-related resources. This allows cursors to work in a somewhat sane fashion with nested transactions. For now, cursor behavior is non-subtransactional, that is a cursor's state does not roll back if you abort a subtransaction that fetched from the cursor. We might want to change that later.	2004-07-17 03:32:14 +00:00
Tom Lane	94d4d240bb	Rename XLOG_BTREE_NEWPAGE xlog record type into XLOG_HEAP_NEWPAGE, and shift support code into heapam.c accordingly. This is in service of soon-to-be-committed ALTER TABLE SET TABLESPACE code that will want to use this same record type for both heaps and indexes. Theoretically I should have forced initdb for this, but in practice there is no change in xlog contents because CVS tip will never really emit this record type anyhow...	2004-07-11 18:01:45 +00:00
Tom Lane	f5c798ee82	Fix no-longer-correct bit-pushing in TransactionIdSetStatus, per Alvaro.	2004-07-03 02:55:56 +00:00
Tom Lane	b6197fe069	Further review of xact.c state machine for nested transactions. Fix problems with starting subtransactions inside already-failed transactions. Clean up some comments.	2004-07-01 20:11:03 +00:00
Tom Lane	573a71a5da	Nested transactions. There is still much left to do, especially on the performance front, but with feature freeze upon us I think it's time to drive a stake in the ground and say that this will be in 7.5. Alvaro Herrera, with some help from Tom Lane.	2004-07-01 00:52:04 +00:00
Tom Lane	2467394ee1	Tablespaces. Alternate database locations are dead, long live tablespaces. There are various things left to do: contrib dbsize and oid2name modules need work, and so does the documentation. Also someone should think about COMMENT ON TABLESPACE and maybe RENAME TABLESPACE. Also initlocation is dead, it just doesn't know it yet. Gavin Sherry and Tom Lane.	2004-06-18 06:14:31 +00:00
Tom Lane	950d047ec5	Give inet/cidr datatypes their own hash function that ignores the inet vs cidr type bit, the same as network_eq does. This is needed for hash joins and hash aggregation to work correctly on these types. Per bug report from Michael Fuhr, 2004-04-13. Also, improve hash function for int8 as suggested by Greg Stark.	2004-06-13 21:57:28 +00:00
Tom Lane	c541bb86e9	Infrastructure for I/O of composite types: arrange for the I/O routines of a composite type to get that type's OID as their second parameter, in place of typelem which is useless. The actual changes are mostly centralized in getTypeInputInfo and siblings, but I had to fix a few places that were fetching pg_type.typelem for themselves instead of using the lsyscache.c routines. Also, I renamed all the related variables from 'typelem' to 'typioparam' to discourage people from assuming that they necessarily contain array element types.	2004-06-06 00:41:28 +00:00
Tom Lane	c3a153afed	Tweak palloc/repalloc to allow zero bytes to be requested, as per recent proposal. Eliminate several dozen now-unnecessary hacks to avoid palloc(0). (It's likely there are more that I didn't find.)	2004-06-05 19:48:09 +00:00
Tom Lane	ae93e5fd6e	Make the world very nearly safe for composite-type columns in tables. 1. Solve the problem of not having TOAST references hiding inside composite values by establishing the rule that toasting only goes one level deep: a tuple can contain toasted fields, but a composite-type datum that is to be inserted into a tuple cannot. Enforcing this in heap_formtuple is relatively cheap and it avoids a large increase in the cost of running the tuptoaster during final storage of a row. 2. Fix some interesting problems in expansion of inherited queries that reference whole-row variables. We never really did this correctly before, but it's now relatively painless to solve by expanding the parent's whole-row Var into a RowExpr() selecting the proper columns from the child. If you dike out the preventive check in CheckAttributeType(), composite-type columns now seem to actually work. However, we surely cannot ship them like this --- without I/O for composite types, you can't get pg_dump to dump tables containing them. So a little more work still to do.	2004-06-05 01:55:05 +00:00
Tom Lane	8f2ea8b7b5	Resurrect heap_deformtuple(), this time implemented as a singly nested loop over the fields instead of a loop around heap_getattr. This is considerably faster (O(N) instead of O(N^2)) when there are nulls or varlena fields, since those prevent use of attcacheoff. Replace loops over heap_getattr with heap_deformtuple in situations where all or most of the fields have to be fetched, such as printtup and tuptoaster. Profiling done more than a year ago shows that this should be a nice win for situations involving many-column tables.	2004-06-04 20:35:21 +00:00
Tom Lane	921d749bd4	Adjust our timezone library to use pg_time_t (typedef'd as int64) in place of time_t, as per prior discussion. The behavior does not change on machines without a 64-bit-int type, but on machines with one, which is most, we are rid of the bizarre boundary behavior at the edges of the 32-bit-time_t range (1901 and 2038). The system will now treat times over the full supported timestamp range as being in your local time zone. It may seem a little bizarre to consider that times in 4000 BC are PST or EST, but this is surely at least as reasonable as propagating Gregorian calendar rules back that far. I did not modify the format of the zic timezone database files, which means that for the moment the system will not know about daylight-savings periods outside the range 1901-2038. Given the way the files are set up, it's not a simple decision like 'widen to 64 bits'; we have to actually think about the range of years that need to be supported. We should probably inquire what the plans of the upstream zic people are before making any decisions of our own.	2004-06-03 02:08:07 +00:00
Tom Lane	2095206de1	Adjust btree index build to not use shared buffers, thereby avoiding the locking conflict against concurrent CHECKPOINT that was discussed a few weeks ago. Also, if not using WAL archiving (which is always true ATM but won't be if PITR makes it into this release), there's no need to WAL-log the index build process; it's sufficient to force-fsync the completed index before commit. This seems to gain about a factor of 2 in my tests, which is consistent with writing half as much data. I did not try it with WAL on a separate drive though --- probably the gain would be a lot less in that scenario.	2004-06-02 17:28:18 +00:00
Tom Lane	e674707968	Minor code rationalization: FlushRelationBuffers just returns void, rather than an error code, and does elog(ERROR) not elog(WARNING) when it detects a problem. All callers were simply elog(ERROR)'ing on failure return anyway, and I find it hard to envision a caller that would not, so we may as well simplify the callers and produce the more useful error message directly.	2004-05-31 19:24:05 +00:00
Tom Lane	9b178555fc	Per previous discussions, get rid of use of sync(2) in favor of explicitly fsync'ing every (non-temp) file we have written since the last checkpoint. In the vast majority of cases, the burden of the fsyncs should fall on the bgwriter process not on backends. (To this end, we assume that an fsync issued by the bgwriter will force out blocks written to the same file by other processes using other file descriptors. Anyone have a problem with that?) This makes the world safe for WIN32, which ain't even got sync(2), and really makes the world safe for Unixen as well, because sync(2) never had the semantics we need: it offers no way to wait for the requested I/O to finish. Along the way, fix a bug I recently introduced in xlog recovery: file truncation replay failed to clear bufmgr buffers for the dropped blocks, which could result in 'PANIC: heap_delete_redo: no block' later on in xlog replay.	2004-05-31 03:48:10 +00:00
Neil Conway	72b6ad6313	Use the new List API function names throughout the backend, and disable the list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.	2004-05-30 23:40:41 +00:00
Tom Lane	076a055acf	Separate out bgwriter code into a logically separate module, rather than being random pieces of other files. Give bgwriter responsibility for all checkpoint activity (other than a post-recovery checkpoint); so this child process absorbs the functionality of the former transient checkpoint and shutdown subprocesses. While at it, create an actual include file for postmaster.c, which for some reason never had its own file before.	2004-05-29 22:48:23 +00:00
Tom Lane	1a321f26d8	Code review for EXEC_BACKEND changes. Reduce the number of #ifdefs by about a third, make it work on non-Windows platforms again. (But perhaps I broke the WIN32 code, since I have no way to test that.) Fold all the paths that fork postmaster child processes to go through the single routine SubPostmasterMain, which takes care of resurrecting the state that would normally be inherited from the postmaster (including GUC variables). Clean up some places where there's no particularly good reason for the EXEC and non-EXEC cases to work differently. Take care of one or two FIXMEs that remained in the code.	2004-05-28 05:13:32 +00:00
Tom Lane	16974ee910	Get rid of the former rather baroque mechanism for propagating the values of ThisStartUpID and RedoRecPtr into new backends. It's a lot easier just to make them all grab the values out of shared memory during startup. This helps to decouple the postmaster from checkpoint execution, which I need since I'm intending to let the bgwriter do it instead, and it also fixes a bug in the Win32 port: ThisStartUpID wasn't getting propagated at all AFAICS. (Doesn't give me a lot of faith in the amount of testing that port has gotten.)	2004-05-27 17:12:57 +00:00
Neil Conway	d0b4399d81	Reimplement the linked list data structure used throughout the backend. In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.	2004-05-26 04:41:50 +00:00
Tom Lane	4d86ae4260	For multi-table ANALYZE, use per-table transactions when possible (ie, when not inside a transaction block), so that we can avoid holding locks longer than necessary. Per trouble report from Philip Warner.	2004-05-22 23:14:38 +00:00
Tom Lane	e6319d1d28	Put back #include <sys/time.h> in files that seem to need it on Linux.	2004-05-21 16:08:47 +00:00

877 Commits