postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-10-07 10:36:49 +02:00

Author	SHA1	Message	Date
Andres Freund	31c453165b	Commonalize process startup code. Move common code, that was duplicated in every postmaster child/every standalone process, into two functions in miscinit.c. Not only does that already result in a fair amount of net code reduction but it also makes it much easier to remove more duplication in the future. The prime motivation wasn't code deduplication though, but easier addition of new common code.	2015-01-14 00:33:14 +01:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Tom Lane	4a14f13a0a	Improve hash_create's API for selecting simple-binary-key hash functions. Previously, if you wanted anything besides C-string hash keys, you had to specify a custom hashing function to hash_create(). Nearly all such callers were specifying tag_hash or oid_hash; which is tedious, and rather error-prone, since a caller could easily miss the opportunity to optimize by using hash_uint32 when appropriate. Replace this with a design whereby callers using simple binary-data keys just specify HASH_BLOBS and don't need to mess with specific support functions. hash_create() itself will take care of optimizing when the key size is four bytes. This nets out saving a few hundred bytes of code space, and offers a measurable performance improvement in tidbitmap.c (which was not exploiting the opportunity to use hash_uint32 for its 4-byte keys). There might be some wins elsewhere too, I didn't analyze closely. In future we could look into offering a similar optimized hashing function for 8-byte keys. Under this design that could be done in a centralized and machine-independent fashion, whereas getting it right for keys of platform-dependent sizes would've been notationally painful before. For the moment, the old way still works fine, so as not to break source code compatibility for loadable modules. Eventually we might want to remove tag_hash and friends from the exported API altogether, since there's no real need for them to be explicitly referenced from outside dynahash.c. Teodor Sigaev and Tom Lane	2014-12-18 13:36:36 -05:00
Fujii Masao	38628db8d8	Add memory barriers for PgBackendStatus.st_changecount protocol. st_changecount protocol needs the memory barriers to ensure that the apparent order of execution is as it desires. Otherwise, for example, the CPU might rearrange the code so that st_changecount is incremented twice before the modification on a machine with weak memory ordering. This surprising result can lead to bugs. This commit introduces the macros to load and store st_changecount with the memory barriers. These are called before and after PgBackendStatus entries are modified or copied into private memory, in order to prevent CPU from reordering PgBackendStatus access. Per discussion on pgsql-hackers, we decided not to back-patch this to 9.4 or before until we get an actual bug report about this. Patch by me. Review by Robert Haas.	2014-12-18 23:07:51 +09:00
Tom Lane	06d5803ffa	Fix assorted confusion between Oid and int32. In passing, also make some debugging elog's in pgstat.c a bit more consistently worded. Back-patch as far as applicable (9.3 or 9.4; none of these mistakes are really old). Mark Dilger identified and patched the type violations; the message rewordings are mine.	2014-12-11 15:41:15 -05:00
Kevin Grittner	ac46de56ea	Smooth reporting of commit/rollback statistics. If a connection committed or rolled back any transactions within a PGSTAT_STAT_INTERVAL pacing interval without accessing any tables, the reporting of those statistics would be held up until the connection closed or until it ended a PGSTAT_STAT_INTERVAL interval in which it had accessed a table. This could result in under- reporting of transactions for an extended period, followed by a spike in reported transactions. While this is arguably a bug, the impact is minimal, primarily affecting, and being affected by, monitoring software. It might cause more confusion than benefit to change the existing behavior in released stable branches, so apply only to master and the 9.4 beta. Gurjeet Singh, with review and editing by Kevin Grittner, incorporating suggested changes from Abhijit Menon-Sen and Tom Lane.	2014-07-02 15:20:30 -05:00
Fujii Masao	654e8e4447	Save pg_stat_statements statistics file into $PGDATA/pg_stat directory at shutdown. `187492b6c2` changed pgstat.c so that the stats files were saved into $PGDATA/pg_stat directory when the server was shutdowned. But it accidentally forgot to change the location of pg_stat_statements permanent stats file. This commit fixes pg_stat_statements so that its stats file is also saved into $PGDATA/pg_stat at shutdown. Since this fix changes the file layout, we don't back-patch it to 9.3 where this oversight was introduced.	2014-06-04 12:09:45 +09:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Tom Lane	cad4fe6455	Use AF_UNSPEC not PF_UNSPEC in getaddrinfo calls. According to the Single Unix Spec and assorted man pages, you're supposed to use the constants named AF_xxx when setting ai_family for a getaddrinfo call. In a few places we were using PF_xxx instead. Use of PF_xxx appears to be an ancient BSD convention that was not adopted by later standardization. On BSD and most later Unixen, it doesn't matter much because those constants have equivalent values anyway; but nonetheless this code is not per spec. In the same vein, replace PF_INET by AF_INET in one socket() call, which wasn't even consistent with the other socket() call in the same function let alone the remainder of our code. Per investigation of a Cygwin trouble report from Marco Atzeri. It's probably a long shot that this will fix his issue, but it's wrong in any case.	2014-04-16 13:21:20 -04:00
Tom Lane	682c5bbec5	Fix bugs in manipulation of PgBackendStatus.st_clienthostname. Initialization of this field was not being done according to the st_changecount protocol (it has to be done within the changecount increment range, not outside). And the test to see if the value should be reported as null was wrong. Noted while perusing uses of Port.remote_hostname. This was wrong from the introduction of this code (commit `4a25bc145`), so back-patch to 9.1.	2014-04-01 21:30:34 -04:00
Robert Haas	79a4d24f31	Make it easy to detach completely from shared memory. The new function dsm_detach_all() can be used either by postmaster children that don't wish to take any risk of accidentally corrupting shared memory; or by forked children of regular backends with the same need. This patch also updates the postmaster children that already do PGSharedMemoryDetach() to do dsm_detach_all() as well. Per discussion with Tom Lane.	2014-03-18 07:58:53 -04:00
Alvaro Herrera	2b4f2ab33d	Remove the correct pgstat file on DROP DATABASE We were unlinking the permanent file, not the non-permanent one. But since the stat collector already unlinks all permanent files on startup, there was nothing for it to unlink. The non-permanent file remained in place, and was copied to the permanent directory on shutdown, so in effect no file was ever dropped. Backpatch to 9.3, where the issue was introduced by commit `187492b6c2`. Before that, there were no per-database files and thus no file to drop on DROP DATABASE. Per report from Thom Brown. Author: Tomáš Vondra	2014-03-05 13:03:29 -03:00
Robert Haas	dd1a3bccca	Show xid and xmin in pg_stat_activity and pg_stat_replication. Christian Kruse, reviewed by Andres Freund and myself, with further minor adjustments by me.	2014-02-25 12:34:04 -05:00
Fujii Masao	9132b189bf	Add pg_stat_archiver statistics view. This view shows the statistics about the WAL archiver process's activity. Gabriele Bartolini, reviewed by Michael Paquier, refactored a bit by me.	2014-01-29 02:58:22 +09:00
Tom Lane	115f414124	Fix VACUUM's reporting of dead-tuple counts to the stats collector. Historically, VACUUM has just reported its new_rel_tuples estimate (the same thing it puts into pg_class.reltuples) to the stats collector. That number counts both live and dead-but-not-yet-reclaimable tuples. This behavior may once have been right, but modern versions of the pgstats code track live and dead tuple counts separately, so putting the total into n_live_tuples and zero into n_dead_tuples is surely pretty bogus. Fix it to report live and dead tuple counts separately. This doesn't really do much for situations where updating transactions commit concurrently with a VACUUM scan (possibly causing double-counting or omission of the tuples they add or delete); but it's clearly an improvement over what we were doing before. Hari Babu, reviewed by Amit Kapila	2014-01-18 19:24:33 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Tom Lane	a3b4aeecfe	Ooops, should use double not single quotes in StaticAssertStmt(). That's what I get for testing this on an older compiler.	2014-01-02 21:54:20 -05:00
Tom Lane	a7ef273e1c	Fix calculation of maximum statistics-message size. The PGSTAT_NUM_TABENTRIES macro should have been updated when new fields were added to struct PgStat_MsgTabstat in commit `644828908`, but it wasn't. Fix that. Also, add a static assertion that we didn't overrun the intended size limit on stats messages. This will not necessarily catch every mistake in computing the maximum array size for stats messages, but it will catch ones that have practical consequences. (The assertion in fact doesn't complain about the aforementioned error in PGSTAT_NUM_TABENTRIES, because that was not big enough to cause the array length to increase.) No back-patch, as there's no actual bug in existing releases; this is just in the nature of future-proofing. Mark Dilger and Tom Lane	2014-01-02 21:45:51 -05:00
Tom Lane	20fe870753	Be more wary of unwanted whitespace in pgstat_reset_remove_files(). sscanf isn't the easiest thing to use for exact pattern checks ... also, don't use strncmp where strcmp will do.	2013-08-19 19:36:04 -04:00
Alvaro Herrera	f9b50b7c18	Fix removal of files in pgstats directories Instead of deleting all files in stats_temp_directory and the permanent directory on a crash, only remove those files that match the pattern of files we actually write in them, to avoid possibly clobbering existing unrelated contents of the temporary directory. Per complaint from Jeff Janes, and subsequent discussion, starting at message CAMkU=1z9+7RsDODnT4=cDFBRBp8wYQbd_qsLcMtKEf-oFwuOdQ@mail.gmail.com Also, fix a bug in the same routine to avoid removing files from the permanent directory twice (instead of once from that directory and then from the temporary directory), also per report from Jeff Janes, in message CAMkU=1wbk947=-pAosDMX5VC+sQw9W4ttq6RM9rXu=MjNeEQKA@mail.gmail.com	2013-08-19 17:48:17 -04:00
Tom Lane	fa2fad3c06	Improve ilist.h's support for deletion of slist elements during iteration. Previously one had to use slist_delete(), implying an additional scan of the list, making this infrastructure considerably less efficient than traditional Lists when deletion of element(s) in a long list is needed. Modify the slist_foreach_modify() macro to support deleting the current element in O(1) time, by keeping a "prev" pointer in addition to "cur" and "next". Although this makes iteration with this macro a bit slower, no real harm is done, since in any scenario where you're not going to delete the current list element you might as well just use slist_foreach instead. Improve the comments about when to use each macro. Back-patch to 9.3 so that we'll have consistent semantics in all branches that provide ilist.h. Note this is an ABI break for callers of slist_foreach_modify(). Andres Freund and Tom Lane	2013-07-24 17:42:34 -04:00
Robert Haas	568d4138c6	Use an MVCC snapshot, rather than SnapshotNow, for catalog scans. SnapshotNow scans have the undesirable property that, in the face of concurrent updates, the scan can fail to see either the old or the new versions of the row. In many cases, we work around this by requiring DDL operations to hold AccessExclusiveLock on the object being modified; in some cases, the existing locking is inadequate and random failures occur as a result. This commit doesn't change anything related to locking, but will hopefully pave the way to allowing lock strength reductions in the future. The major issue has held us back from making this change in the past is that taking an MVCC snapshot is significantly more expensive than using a static special snapshot such as SnapshotNow. However, testing of various worst-case scenarios reveals that this problem is not severe except under fairly extreme workloads. To mitigate those problems, we avoid retaking the MVCC snapshot for each new scan; instead, we take a new snapshot only when invalidation messages have been processed. The catcache machinery already requires that invalidation messages be sent before releasing the related heavyweight lock; else other backends might rely on locally-cached data rather than scanning the catalog at all. Thus, making snapshot reuse dependent on the same guarantees shouldn't break anything that wasn't already subtly broken. Patch by me. Review by Michael Paquier and Andres Freund.	2013-07-02 09:47:01 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Tom Lane	f7b0006f42	Avoid updating our PgBackendStatus entry when track_activities is off. The point of turning off track_activities is to avoid this reporting overhead, but a thinko in commit `4f42b546fd` caused pgstat_report_activity() to perform half of its updates anyway. Fix that, and also make sure that we clear all the now-disabled fields when transitioning to the non-reporting state.	2013-04-03 14:13:28 -04:00
Kevin Grittner	3bf3ab8c56	Add a materialized view relations. A materialized view has a rule just like a view and a heap and other physical properties like a table. The rule is only used to populate the table, references in queries refer to the materialized data. This is a minimal implementation, but should still be useful in many cases. Currently data is only populated "on demand" by the CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements. It is expected that future releases will add incremental updates with various timings, and that a more refined concept of defining what is "fresh" data will be developed. At some point it may even be possible to have queries use a materialized in place of references to underlying tables, but that requires the other above-mentioned features to be working first. Much of the documentation work by Robert Haas. Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja Security review by KaiGai Kohei, with a decision on how best to implement sepgsql still pending.	2013-03-03 18:23:31 -06:00
Alvaro Herrera	6e3fd96463	Remove useless variable Per Jeff Janes	2013-02-21 11:46:46 -03:00
Alvaro Herrera	187492b6c2	Split pgstat file in smaller pieces We now write one file per database and one global file, instead of having the whole thing in a single huge file. This reduces the I/O that must be done when partial data is required -- which is all the time, because each process only needs information on its own database anyway. Also, the autovacuum launcher does not need data about tables and functions in each database; having the global stats for all DBs is enough. Catalog version bumped because we have a new subdir under PGDATA. Author: Tomas Vondra. Some rework by Álvaro Testing by Jeff Janes Other discussion by Heikki Linnakangas, Tom Lane.	2013-02-18 18:12:52 -03:00
Tom Lane	c5aad8dc14	Fix possible failure to send final transaction counts to stats collector. Normally, we suppress sending a tabstats message to the collector unless there were some actual table stats to send. However, during backend exit we should force out the message if there are any transaction commit/abort counts to send, else the session's last few commit/abort counts will never get reported at all. We had logic for this, but the short-circuit test at the top of pgstat_report_stat() ignored the "force" flag, with the consequence that session-ending transactions that touched no database-local tables would not get counted. Seems to be an oversight in my commit `641912b4d1`, which added the "force" flag. That was back in 8.3, so back-patch to all supported versions.	2013-02-07 14:44:00 -05:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Tom Lane	e81e8f9342	Split up process latch initialization for more-fail-soft behavior. In the previous coding, new backend processes would attempt to create their self-pipe during the OwnLatch call in InitProcess. However, pipe creation could fail if the kernel is short of resources; and the system does not recover gracefully from a FATAL error right there, since we have armed the dead-man switch for this process and not yet set up the on_shmem_exit callback that would disarm it. The postmaster then forces an unnecessary database-wide crash and restart, as reported by Sean Chittenden. There are various ways we could rearrange the code to fix this, but the simplest and sanest seems to be to split out creation of the self-pipe into a new function InitializeLatchSupport, which must be called from a place where failure is allowed. For most processes that gets called in InitProcess or InitAuxiliaryProcess, but processes that don't call either but still use latches need their own calls. Back-patch to 9.1, which has only a part of the latch logic that 9.2 and HEAD have, but nonetheless includes this bug.	2012-10-14 22:59:56 -04:00
Alvaro Herrera	c219d9b0a5	Split tuple struct defs from htup.h to htup_details.h This reduces unnecessary exposure of other headers through htup.h, which is very widely included by many files. I have chosen to move the function prototypes to the new file as well, because that means htup.h no longer needs to include tupdesc.h. In itself this doesn't have much effect in indirect inclusion of tupdesc.h throughout the tree, because it's also required by execnodes.h; but it's something to explore in the future, and it seemed best to do the htup.h change now while I'm busy with it.	2012-08-30 16:52:35 -04:00
Peter Eisentraut	eeece9e609	Unify calling conventions for postgres/postmaster sub-main functions There was a wild mix of calling conventions: Some were declared to return void and didn't return, some returned an int exit code, some claimed to return an exit code, which the callers checked, but actually never returned, and so on. Now all of these functions are declared to return void and decorated with attribute noreturn and don't return. That's easiest, and most code already worked that way.	2012-06-25 21:30:12 +03:00
Tom Lane	9e18eacbdf	Fix stats collector to recover nicely when system clock goes backwards. Formerly, if the system clock went backwards, the stats collector would fail to update the stats file any more until the clock reading again exceeds whatever timestamp was last written into the stats file. Such glitches in the clock's behavior are not terribly unlikely on machines not using NTP. Such a scenario has been observed to cause regression test failures in the buildfarm, and it could have bad effects on the behavior of autovacuum, so it seems prudent to install some defenses. We could directly detect the clock going backwards by adding GetCurrentTimestamp calls in the stats collector's main loop, but that would hurt performance on platforms where GetCurrentTimestamp is expensive. To minimize the performance hit in normal cases, adopt a more complicated scheme wherein backends check for clock skew when reading the stats file, and if they see it, signal the stats collector by sending an extra stats inquiry message. The stats collector does an extra GetCurrentTimestamp only when it receives an inquiry with an apparently out-of-order timestamp. To avoid unnecessary GetCurrentTimestamp calls, expand the inquiry messages to carry the backend's current clock reading as well as its stats cutoff time. The latter, being intentionally slightly in-the-past, would trigger more clock rechecks than we need if it were used for this purpose. We might want to backpatch this change at some point, but let's let it shake out in the buildfarm for awhile first.	2012-06-17 17:11:49 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Tom Lane	9b63e9869f	In pgstat.c, use a timeout in WaitLatchOrSocket only on Windows. We have no need for a timeout here really, but some broken products from Redmond seem to lose FD_READ events occasionally, and waking up and retrying the recv() is the only known way to work around that. Perhaps somebody will be motivated to figure out a better answer here; but not I.	2012-05-14 23:51:34 -04:00
Tom Lane	5a2bb06012	Revert "Add some temporary instrumentation to pgstat.c." This reverts commit `7d88bb73f7`. That instrumentation has served its purpose.	2012-05-14 23:08:10 -04:00
Tom Lane	d461d0502b	For testing purposes, reinsert a timeout in pgstat.c's wait call. Test results from buildfarm members mastodon/narwhal (Windows Server 2003) make it look like that platform just plain loses FD_READ events occasionally, and the only reason our previous coding seemed to work was that it timed out every couple of seconds and retried the whole operation. Try to verify this by reinserting a finite timeout into the pgstat loop. This isn't meant to be a permanent patch either, just to confirm or disprove a theory.	2012-05-14 15:03:14 -04:00
Tom Lane	f1ca51549e	Force pgwin32_recv into nonblock mode when called from pgstat.c. This should get rid of the usage of pgwin32_waitforsinglesocket entirely, and perhaps thereby remove the race condition that's evidently still present on some versions of Windows. The previous arrangement was a bit unsafe anyway, since waiting at the recv() would not allow pgstat to notice postmaster death.	2012-05-14 10:57:07 -04:00
Tom Lane	7d88bb73f7	Add some temporary instrumentation to pgstat.c. Log main-loop blocking events and the results of inquiry messages. This is to get some clarity as to what's happening on those Windows buildfarm members that still don't like the latch-ified stats collector. This bulks up the postmaster log a tad, so I won't leave it in place for long.	2012-05-13 21:11:31 -04:00
Tom Lane	966970ed63	Re-revert stats collector latch changes. This reverts commit `cb2f2873d6`, restoring the latch-ified stats collector logic. We'll soon see if this works any better on the Windows buildfarm machines.	2012-05-13 14:44:39 -04:00
Tom Lane	cb2f2873d6	Temporarily revert stats collector latch changes so we can ship beta1. This patch reverts commit `49340037ee` and some follow-on tweaking in pgstat.c. While the basic scheme of latch-ifying the stats collector seems sound enough, it's failing on most Windows buildfarm members for unknown reasons, and there's no time left to debug that before 9.2beta1. Better to ship a beta version without this improvement. I hope to re-revert this once beta1 is out, though.	2012-05-10 17:26:33 -04:00
Tom Lane	f40022f1ad	Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers. Per a suggestion from Peter Geoghegan, make WaitLatch responsible for verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by testing PostmasterIsAlive). Then simplify its callers, who no longer need to do that for themselves. Remove weasel wording about falsely-set result bits from WaitLatch's API contract.	2012-05-10 14:34:53 -04:00
Tom Lane	fd71421b01	Improve tests for postmaster death in auxiliary processes. In checkpointer and walwriter, avoid calling PostmasterIsAlive unless WaitLatch has reported WL_POSTMASTER_DEATH. This saves a kernel call per iteration of the process's outer loop, which is not all that much, but a cycle shaved is a cycle earned. I had already removed the unconditional PostmasterIsAlive calls in bgwriter and pgstat in previous patches, but forgot that WL_POSTMASTER_DEATH is supposed to be treated as untrustworthy (per comment in unix_latch.c); so adjust those two cases to match. There are a few other places where the same idea might be applied, but only after substantial code rearrangement, so I didn't bother.	2012-05-10 00:54:56 -04:00
Tom Lane	49340037ee	Reduce idle power consumption of stats collector process. Latch-ify the stats collector, so that it does not need an arbitrary wakeup cycle to check for postmaster death. The incremental savings in idle power is pretty marginal, since we only had it waking every two seconds; but I believe that this patch may also improve the collector's performance under load, by reducing the number of kernel calls made per message when messages are arriving constantly (we now avoid a select/poll call except when we need to sleep). The change also reduces the time needed for a normal database shutdown on platforms where signals don't interrupt select().	2012-05-08 21:26:46 -04:00
Tom Lane	809e7e21af	Converge all SQL-level statistics timing values to float8 milliseconds. This patch adjusts the core statistics views to match the decision already taken for pg_stat_statements, that values representing elapsed time should be represented as float8 and measured in milliseconds. By using float8, we are no longer tied to a specific maximum precision of timing data. (Internally, it's still microseconds, but we could now change that without needing changes at the SQL level.) The columns affected are pg_stat_bgwriter.checkpoint_write_time pg_stat_bgwriter.checkpoint_sync_time pg_stat_database.blk_read_time pg_stat_database.blk_write_time pg_stat_user_functions.total_time pg_stat_user_functions.self_time pg_stat_xact_user_functions.total_time pg_stat_xact_user_functions.self_time The first four of these are new in 9.2, so there is no compatibility issue from changing them. The others require a release note comment that they are now double precision (and can show a fractional part) rather than bigint as before; also their underlying statistics functions now match the column definitions, instead of returning bigint microseconds.	2012-04-30 14:03:33 -04:00
Tom Lane	1dd89eadcd	Rename I/O timing statistics columns to blk_read_time and blk_write_time. This seems more consistent with the pre-existing choices for names of other statistics columns. Rename assorted internal identifiers to match.	2012-04-29 18:13:33 -04:00
Tom Lane	cdbad241f4	Clear I/O timing counters after sending them to the stats collector. This oversight caused the reported times to accumulate in an O(N^2) fashion the longer a backend runs.	2012-04-28 15:11:13 -04:00
Robert Haas	b736aef2ec	Publish checkpoint timing information to pg_stat_bgwriter. Greg Smith, Peter Geoghegan, and Robert Haas	2012-04-05 14:04:37 -04:00
Robert Haas	644828908f	Expose track_iotiming data via the statistics collector. Ants Aasma's original patch to add timing information for buffer I/O requests exposed this data at the relation level, which was judged too costly. I've here exposed it at the database level instead.	2012-04-05 11:40:24 -04:00
Magnus Hagander	7729e22d83	Fix a copy/pasted typo in several comments	2012-01-26 16:02:33 +01:00

1 2 3 4 5 ...

280 Commits