Commit Graph

1012 Commits

Author SHA1 Message Date
Tom Lane f4d242ef94 Remove some unnecessary tests of pgstat_track_counts.
We may as well make pgstat_count_heap_scan() and related macros just count
whenever rel->pgstat_info isn't null.  Testing pgstat_track_counts buys
nothing at all in the normal case where that flag is ON; and when it's OFF,
the pgstat_info link will be null, so it's still a useless test.

This change is unlikely to buy any noticeable performance improvement,
but a cycle shaved is a cycle earned; and my investigations earlier today
convinced me that we're down to the point where individual instructions in
the inner execution loops are starting to matter.
2010-10-12 14:44:25 -04:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Magnus Hagander 594419e74a Treat exit code 128 (ERROR_WAIT_NO_CHILDREN) as non-fatal on Win32,
since it can happen when a process fails to start when the system
is under high load.

Per several bug reports and many peoples investigation.

Back-patch to 8.4, which is as far back as the "deadman-switch"
for shared memory access exists.
2010-09-16 20:37:13 +00:00
Magnus Hagander 946045f04d Add vacuum and analyze counters to pg_stat_*_tables views. 2010-08-21 10:59:17 +00:00
Robert Haas debcec7dc3 Include the backend ID in the relpath of temporary relations.
This allows us to reliably remove all leftover temporary relation
files on cluster startup without reference to system catalogs or WAL;
therefore, we no longer include temporary relations in XLOG_XACT_COMMIT
and XLOG_XACT_ABORT WAL records.

Since these changes require including a backend ID in each
SharedInvalSmgrMsg, the size of the SharedInvalidationMessage.id
field has been reduced from two bytes to one, and the maximum number
of connections has been reduced from INT_MAX / 4 to 2^23-1.  It would
be possible to remove these restrictions by increasing the size of
SharedInvalidationMessage by 4 bytes, but right now that doesn't seem
like a good trade-off.

Review by Jaime Casanova and Tom Lane.
2010-08-13 20:10:54 +00:00
Tom Lane 46aa77c7bd Add stats functions and views to provide access to a transaction's own
statistics counts.  These numbers are being accumulated but haven't yet been
transmitted to the collector (and won't be, until the transaction ends).
For some purposes, though, it's handy to be able to look at them.

Joel Jacobson, reviewed by Itagaki Takahiro
2010-08-08 16:27:06 +00:00
Robert Haas 5ffaa9005c Add restart_after_crash GUC.
Normally, we automatically restart after a backend crash, but in some
cases when PostgreSQL is invoked by clusterware it may be desirable to
suppress this behavior, so we provide an option which does this.
Since no existing GUC group quite fits, create a new group called
"error handling options" for this and the previously undocumented GUC
exit_on_error, which is now documented.

Review by Fujii Masao.
2010-07-20 00:47:53 +00:00
Tom Lane 3ec694e17b Add a log_file_mode GUC that allows control of the file permissions set on
log files created by the syslogger process.

In passing, make unix_file_permissions display its value in octal, same
as log_file_mode now does.

Martin Pihlak
2010-07-16 22:25:51 +00:00
Bruce Momjian 239d769e7e pgindent run for 9.0, second run 2010-07-06 19:19:02 +00:00
Robert Haas 243bbe6ed8 Add stray "else" that seems to have gone missing. 2010-06-24 16:40:45 +00:00
Peter Eisentraut 418e1d82fd Refactor sprintf calls with computed format strings into multiple calls with
constant format strings, so that the compiler can more easily check the
formats for correctness.
2010-06-16 00:54:16 +00:00
Peter Eisentraut cb6038c168 Fix some inconsistent quoting of wal_level values in messages
When referring to postgresql.conf syntax, then it's without quotes
(wal_level=archive); in narrative it's with double quotes.  But never
single quotes.
2010-06-03 21:02:12 +00:00
Robert Haas 5e85315ea7 Avoid starting walreceiver in states where it shouldn't be running.
In particular, it's bad to start walreceiver when in state
PM_WAIT_BACKENDS, because we have no provision to kill walreceiver
when in that state.

Fujii Masao
2010-05-27 02:01:37 +00:00
Robert Haas 615704af1e More fixes for shutdown during recovery.
1. If we receive a fast shutdown request while in the PM_STARTUP state,
process it just as we would in PM_RECOVERY, PM_HOT_STANDBY, or PM_RUN.
Without this change, an early fast shutdown followed by Hot Standby causes
the database to get stuck in a state where a shutdown is pending (so no new
connections are allowed) but the shutdown request is never processed unless
we end Hot Standby and enter normal running.

2. Avoid removing the backup label file when a smart or fast shutdown occurs
during recovery.  It makes sense to do this once we've reached normal running,
since we must be taking a backup which now won't be valid.  But during
recovery we must be recovering from a previously taken backup, and any backup
label file is needed to restart recovery from the right place.

Fujii Masao and Robert Haas
2010-05-26 12:32:41 +00:00
Robert Haas ea9968c331 Rename PM_RECOVERY_CONSISTENT and PMSIGNAL_RECOVERY_CONSISTENT.
The new names PM_HOT_STANDBY and PMSIGNAL_BEGIN_HOT_STANDBY more accurately
reflect their actual function.
2010-05-15 20:01:32 +00:00
Robert Haas a724584735 We now accept read-only connections in state PM_RECOVERY_CONSISTENT. 2010-05-14 18:08:33 +00:00
Tom Lane 4a69624f49 Cause the archiver process to adopt new postgresql.conf settings (particularly
archive_command) as soon as possible, namely just before issuing a new call
of archive_command, even when there is a backlog of files to be archived.
The original coding would only absorb new settings after clearing the backlog
and returning to the outer loop.  Per discussion.

Back-patch to 8.3.  The logic in prior versions is a bit different and it
doesn't seem worth taking any risks of breaking it.
2010-05-11 16:42:28 +00:00
Tom Lane 77acab75df Modify ShmemInitStruct and ShmemInitHash to throw errors internally,
rather than returning NULL for some-but-not-all failures as they used to.
Remove now-redundant tests for NULL from call sites.

We had to do something about this because many call sites were failing to
check for NULL; and changing it like this seems a lot more useful and
mistake-proof than adding checks to the call sites without them.
2010-04-28 16:54:16 +00:00
Heikki Linnakangas 9b8a73326e Introduce wal_level GUC to explicitly control if information needed for
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.

Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_mode setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment, they used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.

Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
2010-04-28 16:10:43 +00:00
Heikki Linnakangas 961ad3fdd9 On Windows, syslogger runs in two threads. The main thread processes config
reload and rotation signals, and a helper thread reads messages from the
pipe and writes them to the log file. However, server code isn't generally
thread-safe, so if both try to do e.g palloc()/pfree() at the same time,
bad things will happen. To fix that, use a critical section (which is like
a mutex) to enforce that only one the threads are active at a time.
2010-04-16 09:51:49 +00:00
Robert Haas 1c850fa807 Make smart shutdown work in combination with Hot Standby/Streaming Replication.
At present, killing the startup process does not release any locks it holds,
so we must wait to stop the startup and walreceiver processes until all
read-only backends have exited.  Without this patch, the startup and
walreceiver processes never exit, so the server gets permanently stuck in
a half-shutdown state.

Fujii Masao, with review, docs, and comment adjustments by me.
2010-04-08 01:39:37 +00:00
Heikki Linnakangas 93001dfd18 Don't pass an invalid file handle to dup2(). That causes a crash on
Windows, thanks to a feature in CRT called Parameter Validation.

Backpatch to 8.2, which is the oldest version supported on Windows. In
8.2 and 8.3 also backpatch the earlier change to use DEVNULL instead of
NULL_DEV #define for a /dev/null-like device. NULL_DEV was hard-coded to
"/dev/null" regardless of platform, which didn't work on Windows, while
DEVNULL works on all platforms. Restarting syslogger didn't work on
Windows on versions 8.3 and below because of that.
2010-04-01 20:12:22 +00:00
Simon Riggs 65cd829232 Modify some new and pre-existing messages for translatability. 2010-03-25 20:40:17 +00:00
Tom Lane 223f82d4da Now that we know last_statrequest > last_statwrite can be observed in the
buildfarm, expend a little more effort on the log message for it.
2010-03-24 16:07:10 +00:00
Tom Lane 52e2b33a55 Add some logging code for unexpected cases in pgstat.c, particularly being
unable to read a stats file for reasons other than ENOENT, and having to reset
last_statrequest because it's later than current time in the collector.
Not clear if this will shed any light on the "pgstat wait timeout" business,
but it seems like a good idea in general.

In passing, do some message-style-police work on recently-added
pgstat_reset_shared_counters code.
2010-03-12 22:19:19 +00:00
Bruce Momjian 65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Robert Haas e26c539e9f Wrap calls to SearchSysCache and related functions using macros.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4).  This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.

Design and review by Tom Lane.
2010-02-14 18:42:19 +00:00
Bruce Momjian 4b113d9cdc Document that archive_timeout will force new WAL files even if a single
checkpoint has happened, and recommend adjusting checkpoint_timeout to
reduce the impact of this.
2010-02-05 23:37:43 +00:00
Magnus Hagander f13944e9c9 Make checks for invalid pgStatSock use PGINVALID_SOCKET 2010-01-31 17:39:34 +00:00
Magnus Hagander 083e1b0f27 Add functions to reset the statistics counter for a single table/index or
a single function.
2010-01-28 14:25:41 +00:00
Heikki Linnakangas 1bb2558046 Make standby server continuously retry restoring the next WAL segment with
restore_command, if the connection to the primary server is lost. This
ensures that the standby can recover automatically, if the connection is
lost for a long time and standby falls behind so much that the required
WAL segments have been archived and deleted in the master.

This also makes standby_mode useful without streaming replication; the
server will keep retrying restore_command every few seconds until the
trigger file is found. That's the same basic functionality pg_standby
offers, but without the bells and whistles.

To implement that, refactor the ReadRecord/FetchRecord functions. The
FetchRecord() function introduced in the original streaming replication
patch is removed, and all the retry logic is now in a new function called
XLogReadPage(). XLogReadPage() is now responsible for executing
restore_command, launching walreceiver, and waiting for new WAL to arrive
from primary, as required.

This also changes the life cycle of walreceiver. When launched, it now only
tries to connect to the master once, and exits if the connection fails, or
is lost during streaming for any reason. The startup process detects the
death, and re-launches walreceiver if necessary.
2010-01-27 15:27:51 +00:00
Magnus Hagander 7e40cdc075 Add pg_stat_reset_shared('bgwriter') to reset the cluster-wide shared
statistics of the bgwriter.

Greg Smith
2010-01-19 14:11:32 +00:00
Heikki Linnakangas 40f908bdcd Introduce Streaming Replication.
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me
2010-01-15 09:19:10 +00:00
Tom Lane d5e0029862 Add some simple support and documentation for using process-specific oom_adj
settings to prevent the postmaster from being OOM-killed on Linux systems.

Alex Hunsaker and Tom Lane
2010-01-11 18:39:32 +00:00
Magnus Hagander 87091cb1f1 Create typedef pgsocket for storing socket descriptors.
This silences some warnings on Win64. Not using the proper SOCKET datatype
was actually wrong on Win32 as well, but didn't cause any warnings there.

Also create define PGINVALID_SOCKET to indicate an invalid/non-existing
socket, instead of using a hardcoded -1 value.
2010-01-10 14:16:08 +00:00
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Magnus Hagander 13c5fdb5c8 Fix one more cast for _open_osfhandle().
Tsutomu Yamada
2010-01-02 12:01:29 +00:00
Tom Lane 48c192c15e Revise pgstat's tracking of tuple changes to improve the reliability of
decisions about when to auto-analyze.

The previous code depended on n_live_tuples + n_dead_tuples - last_anl_tuples,
where all three of these numbers could be bad estimates from ANALYZE itself.
Even worse, in the presence of a steady flow of HOT updates and matching
HOT-tuple reclamations, auto-analyze might never trigger at all, even if all
three numbers are exactly right, because n_dead_tuples could hold steady.

To fix, replace last_anl_tuples with an accurately tracked count of the total
number of committed tuple inserts + updates + deletes since the last ANALYZE
on the table.  This can still be compared to the same threshold as before, but
it's much more trustworthy than the old computation.  Tracking this requires
one more intra-transaction counter per modified table within backends, but no
additional memory space in the stats collector.  There probably isn't any
measurable speed difference; if anything it might be a bit faster than before,
since I was able to eliminate some per-tuple arithmetic operations in favor of
adding sums once per (sub)transaction.

Also, simplify the logic around pgstat vacuum and analyze reporting messages
by not trying to fold VACUUM ANALYZE into a single pgstat message.

The original thought behind this patch was to allow scheduling of analyzes
on parent tables by artificially inflating their changes_since_analyze count.
I've left that for a separate patch since this change seems to stand on its
own merit.
2009-12-30 20:32:14 +00:00
Tom Lane 0b39231431 Avoid memory leak if pgstat_vacuum_stat is interrupted partway through.
The temporary hash tables made by pgstat_collect_oids should be allocated
in a short-term memory context, which is not the default behavior of
hash_create.  Noted while looking through hash_create calls in connection
with Robert Haas' recent complaint.

This is a pre-existing bug, but it doesn't seem important enough to
back-patch.  The hash table is not so large that it would matter unless this
happened many times within a session, which seems quite unlikely.
2009-12-27 19:40:07 +00:00
Simon Riggs efc16ea520 Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.

New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.

This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.

Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 01:32:45 +00:00
Peter Eisentraut b63b967a7e If there is no sigdelset(), define it as a macro.
This removes some duplicate code that recreated the identical workaround
when the newer signal API is missing.
2009-12-16 22:55:34 +00:00
Tom Lane 8217cfbd99 Add support for an application_name parameter, which is displayed in
pg_stat_activity and recorded in log entries.

Dave Page, reviewed by Andres Freund
2009-11-28 23:38:08 +00:00
Tom Lane b1d55dca91 Fix memory leak in syslogger: logfile_rotate() would leak a copy of the
output filename if CSV logging was enabled and only one of the two possible
output files got rotated during a particular call (which would, in fact,
typically be the case during a size-based rotation).  This would amount to
about MAXPGPATH (1KB) per rotation, and it's been there since the CSV
code was put in, so it's surprising that nobody noticed it before.
Per bug #5196 from Thomas Poindessous.
2009-11-19 02:45:33 +00:00
Tom Lane 5e66a51c2e Provide a parenthesized-options syntax for VACUUM, analogous to that recently
adopted for EXPLAIN.  This will allow additional options to be implemented
in future without having to make them fully-reserved keywords.  The old syntax
remains available for existing options, however.

Itagaki Takahiro
2009-11-16 21:32:07 +00:00
Peter Eisentraut 45d7e04fce reenable -> re-enable
Pointed out by Debian's lintian.
2009-11-05 20:13:06 +00:00
Tom Lane 66a8417f4e Fix an oversight in an 8.3-era patch: pgstat_initstats should allow stats
to be collected for sequences.

Report and fix by Akira Kurosawa
2009-10-02 22:49:50 +00:00
Tom Lane eeb6cb143a Add a boolean GUC parameter "bonjour" to control whether a Bonjour-enabled
build actually attempts to advertise itself via Bonjour.  Formerly it always
did so, which meant that packagers had to decide for their users whether
this behavior was wanted or not.  The default is "off" to be on the safe
side, though this represents a change in the default behavior of a
Bonjour-enabled build.  Per discussion.
2009-09-08 17:08:36 +00:00
Tom Lane 59b9f3d36d Replace use of the long-deprecated Bonjour API DNSServiceRegistrationCreate
with the not-so-deprecated DNSServiceRegister.  This patch shouldn't change
any user-visible behavior, it just gets rid of a deprecation warning in
--with-bonjour builds.  The new code will fail on OS X releases before 10.3,
but it seems unlikely that anyone will want to run Postgres 8.5 on 10.2.
2009-09-08 16:08:26 +00:00
Tom Lane 47ef623c0b Remove pgstat's discrimination against MsgVacuum and MsgAnalyze messages.
Formerly, these message types would be discarded unless there was already
a stats hash table entry for the target table.  However, the intent of
saving hash table space for unused tables was subverted by the fact that
the physical I/O done by the vacuum or analyze would result in an immediately
following tabstat message, which would create the hash table entry anyway.
All that we had left was surprising loss of statistical data, as in a recent
complaint from Jaime Casanova.

It seems unlikely that a real database would have many tables that go totally
untouched over the long haul, so the consensus is that this "optimization"
serves little purpose anyhow.  Remove it, and just create the hash table
entry on demand in all cases.
2009-09-04 22:32:33 +00:00
Tom Lane 00e6a16d01 Change the autovacuum launcher to read pg_database directly, rather than
via the "flat files" facility.  This requires making it enough like a backend
to be able to run transactions; it's no longer an "auxiliary process" but
more like the autovacuum worker processes.  Also, its signal handling has
to be brought into line with backends/workers.  In particular, since it
now has to handle procsignal.c processing, the special autovac-launcher-only
signal conditions are moved to SIGUSR2.

Alvaro, with some cleanup from Tom
2009-08-31 19:41:00 +00:00
Tom Lane e710b65c1c Remove the use of the pg_auth flat file for client authentication.
(That flat file is now completely useless, but removal will come later.)

To do this, postpone client authentication into the startup transaction
that's run by InitPostgres.  We still collect the startup packet and do
SSL initialization (if needed) at the same time we did before.  The
AuthenticationTimeout is applied separately to startup packet collection
and the actual authentication cycle.  (This is a bit annoying, since it
means a couple extra syscalls; but the signal handling requirements inside
and outside a transaction are sufficiently different that it seems best
to treat the timeouts as completely independent.)

A small security disadvantage is that if the given database name is invalid,
this will be reported to the client before any authentication happens.
We could work around that by connecting to database "postgres" instead,
but consensus seems to be that it's not worth introducing such surprising
behavior.

Processing of all command-line switches and GUC options received from the
client is now postponed until after authentication.  This means that
PostAuthDelay is much less useful than it used to be --- if you need to
investigate problems during InitPostgres you'll have to set PreAuthDelay
instead.  However, allowing an unauthenticated user to set any GUC options
whatever seems a bit too risky, so we'll live with that.
2009-08-29 19:26:52 +00:00
Tom Lane 0a00c9a8ef Remove useless code that propagated FrontendProtocol to a backend via a
PostgresMain switch.  In point of fact, FrontendProtocol is already set
in a backend process, since ProcessStartupPacket() is executed inside
the backend --- it hasn't been run by the postmaster for many years.
And if it were, we'd still certainly want FrontendProtocol to be set before
we get as far as PostgresMain, so that startup errors get reported in the
right protocol.

-v might have some future use in standalone backends, so I didn't go so
far as to remove the switch outright.

Also, initialize FrontendProtocol to 0 not PG_PROTOCOL_LATEST.  The only
likely result of presetting it like that is to mask failure-to-set-it
mistakes.
2009-08-28 18:23:53 +00:00
Tom Lane c66d9ce774 Non-Windows EXEC_BACKEND path was broken by recent write_inheritable_socket
change ... it's got to return true.
2009-08-28 17:42:54 +00:00
Alvaro Herrera 53af86c55c Fix handling of autovacuum reloptions.
In the original coding, setting a single reloption would cause default
values to be used for all the other reloptions.  This is a problem
particularly for autovacuum reloptions.

Itagaki Takahiro
2009-08-27 17:18:44 +00:00
Tom Lane 8bed238c87 Try to make silent_mode behave somewhat reasonably.
Instead of sending stdout/stderr to /dev/null after forking away from the
terminal, send them to postmaster.log within the data directory.  Since
this opens the door to indefinite logfile bloat, recommend even more
strongly that log output be redirected when using silent_mode.

Move the postmaster's initial calls of load_hba() and load_ident() down
to after we have started the log collector, if we are going to.  This
is so that errors reported by them will appear in the "usual" place.

Reclassify silent_mode as a LOGGING_WHERE, not LOGGING_WHEN, parameter,
since it's got absolutely nothing to do with the latter category.

In passing, fix some obsolete references to -S ... this option hasn't
had that switch letter for a long time.

Back-patch to 8.4, since as of 8.4 load_hba() and load_ident() are more
picky (and thus more likely to fail) than they used to be.  This entire
change was driven by a complaint about those errors disappearing into
the bit bucket.
2009-08-24 20:08:32 +00:00
Tom Lane 5a4f763841 Small correction to previous patch: we shouldn't ReleasePostmasterChildSlot
for a dead_end child, because we didn't AssignPostmasterChildSlot.
2009-08-24 18:09:37 +00:00
Alvaro Herrera 45f9b4646f Avoid calling kill() in a postmaster signal handler.
This causes problems when the system load is high, per report from Zdenek
Kotala in <1250860954.1239.114.camel@localhost>; instead of calling kill
directly, have the signal handler set a flag which is checked in ServerLoop.
This way, the handler can return before being called again by a subsequent
signal sent from the autovacuum launcher.  Also, increase the sleep in the
launcher in this failure path to 1 second.

Backpatch to 8.3, which is when the signalling between autovacuum
launcher/postmaster was introduced.

Also, add a couple of ReleasePostmasterChildSlot calls in error paths; this
part backpatched to 8.4 which is when the child slot stuff was introduced.
2009-08-24 17:23:02 +00:00
Tom Lane 04011cc970 Allow backends to start up without use of the flat-file copy of pg_database.
To make this work in the base case, pg_database now has a nailed-in-cache
relation descriptor that is initialized using hardwired knowledge in
relcache.c.  This means pg_database is added to the set of relations that
need to have a Schema_pg_xxx macro maintained in pg_attribute.h.  When this
path is taken, we'll have to do a seqscan of pg_database to find the row
we need.

In the normal case, we are able to do an indexscan to find the database's row
by name.  This is made possible by storing a global relcache init file that
describes only the shared catalogs and their indexes (and therefore is usable
by all backends in any database).  A new backend loads this cache file,
finds its database OID after an indexscan on pg_database, and then loads
the local relcache init file for that database.

This change should effectively eliminate number of databases as a factor
in backend startup time, even with large numbers of databases.  However,
the real reason for doing it is as a first step towards getting rid of
the flat files altogether.  There are still several other sub-projects
to be tackled before that can happen.
2009-08-12 20:53:31 +00:00
Heikki Linnakangas 06f1f53ea9 Fast shutdown stop should forcibly disconnect any active backends, even
if a smart shutdown is already in progress. Backpatch to 8.3, this was broken
in the patch that introduced "dead-end backends".

Per report by Itagaki Takahiro, patch by Fujii Masao.
2009-08-07 05:58:55 +00:00
Magnus Hagander 4000170535 Avoid terminating the postmaster on a number of "can't happen" cases during
backend startup on Win32. Instead, log the error and just forget about
the potentially dangling process, since we can't do anything about it anyway.
2009-08-06 09:50:22 +00:00
Tom Lane 2487d872e0 Create a multiplexing structure for signals to Postgres child processes.
This patch gets us out from under the Unix limitation of two user-defined
signal types.  We already had done something similar for signals directed to
the postmaster process; this adds multiplexing for signals directed to
backends and auxiliary processes (so long as they're connected to shared
memory).

As proof of concept, replace the former usage of SIGUSR1 and SIGUSR2
for backends with use of the multiplexing mechanism.  There are still some
hard-wired definitions of SIGUSR1 and SIGUSR2 for other process types,
but getting rid of those doesn't seem interesting at the moment.

Fujii Masao
2009-07-31 20:26:23 +00:00
Magnus Hagander a7e587863c Reserve the shared memory region during backend startup on Windows, so
that memory allocated by starting third party DLLs doesn't end up
conflicting with it.

Hopefully this solves the long-time issue with "could not reattach
to shared memory" errors on Win32.

Patch from Tsutomu Yamada and me, based on idea from Trevor Talbot.
2009-07-24 20:12:42 +00:00
Tom Lane b11ce5608a Remove no-longer-necessary transmission of postmaster's LC_COLLATE and
LC_CTYPE settings to children via BackendParameters.  Per discussion,
the postmaster is now just using system defaults anyway, so we might as
well save a few cycles during backend startup.
2009-07-08 18:55:35 +00:00
Tom Lane 2de48a83e6 Cleanup and code review for the patch that made bgwriter active during
archive recovery.  Invent a separate state variable and inquiry function
for XLogInsertAllowed() to clarify some tests and make the management of
writing the end-of-recovery checkpoint less klugy.  Fix several places
that were incorrectly testing InRecovery when they should be looking at
RecoveryInProgress or XLogInsertAllowed (because they will now be executed
in the bgwriter not startup process).  Clarify handling of bad LSNs passed
to XLogFlush during recovery.  Use a spinlock for setting/testing
SharedRecoveryInProgress.  Improve quite a lot of comments.

Heikki and Tom
2009-06-26 20:29:04 +00:00
Heikki Linnakangas 7e48b77b1c Fix some serious bugs in archive recovery, now that bgwriter is active
during it:

When bgwriter is active, the startup process can't perform mdsync() correctly
because it won't see the fsync requests accumulated in bgwriter's private
pendingOpsTable. Therefore make bgwriter responsible for the end-of-recovery
checkpoint as well, when it's active.

When bgwriter is active (= archive recovery), the startup process must not
accumulate fsync requests to its own pendingOpsTable, since bgwriter won't
see them there when it performs restartpoints. Make startup process drop its
pendingOpsTable when bgwriter is launched to avoid that.

Update minimum recovery point one last time when leaving archive recovery.
It won't be updated by the end-of-recovery checkpoint because XLogFlush()
sees us as out of recovery already.

This fixes bug #4879 reported by Fujii Masao.
2009-06-25 21:36:00 +00:00
Tom Lane bfd06a713b Fix several places where a function was declared static and then defined
without static.  Per testing with a compiler that complains about this.
2009-06-12 16:17:29 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Alvaro Herrera e66576e58c Fix typo, per Tom 2009-06-09 19:36:28 +00:00
Alvaro Herrera e8f28cb25d Dynamically set a lower bound on autovacuum nap time so that we don't rebuild
the database list too often.

Per bug report from Łukasz Jagiełło and ensuing discussion on
pgsql-performance.
2009-06-09 16:41:02 +00:00
Tom Lane 32ea236361 Improve the IndexVacuumInfo/IndexBulkDeleteResult API to allow somewhat sane
behavior in cases where we don't know the heap tuple count accurately; in
particular partial vacuum, but this also makes the API a bit more useful
for ANALYZE.  This patch adds "estimated_count" flags to both structs so
that an approximate count can be flagged as such, and adjusts the logic
so that approximate counts are not used for updating pg_class.reltuples.

This fixes my previous complaint that VACUUM was putting ridiculous values
into pg_class.reltuples for indexes.  The actual impact of that bug is
limited, because the planner only pays attention to reltuples for an index
if the index is partial; which probably explains why beta testers hadn't
noticed a degradation in plan quality from it.  But it needs to be fixed.

The whole thing is a bit messy and should be redesigned in future, because
reltuples now has the potential to drift quite far away from reality when
a long period elapses with no non-partial vacuums.  But this is as good as
it's going to get for 8.4.
2009-06-06 22:13:52 +00:00
Tom Lane 76d4abf2d9 Improve the recently-added support for properly pluralized error messages
by extending the ereport() API to cater for pluralization directly.  This
is better than the original method of calling ngettext outside the elog.c
code because (1) it avoids double translation, which wastes cycles and in
the worst case could give a wrong result; and (2) it avoids having to use
a different coding method in PL code than in the core backend.  The
client-side uses of ngettext are not touched since neither of these concerns
is very pressing in the client environment.  Per my proposal of yesterday.
2009-06-04 18:33:08 +00:00
Tom Lane 4616d57dad Fix all the server-side SIGQUIT handlers (grumble ... why so many identical
copies?) to ensure they really don't run proc_exit/shmem_exit callbacks,
as was intended.  I broke this behavior recently by installing atexit
callbacks without thinking about the one case where we truly don't want
to run those callback functions.  Noted in an example from Dave Page.
2009-05-15 15:56:39 +00:00
Tom Lane 969d7cd431 Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory.  We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead.  Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft).  If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.

The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway.  This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.

This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.

Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 19:59:00 +00:00
Tom Lane 4071e0c242 Fix missed usage of DLNewElem() 2009-05-04 02:46:36 +00:00
Alvaro Herrera a1e1ef4f77 Avoid a memory allocation in the backend startup code, to avoid having to check
whether it failed.  Modelled after catcache.c's usage of DlList, per suggestion
from Tom.
2009-05-04 02:24:17 +00:00
Tom Lane d90984f4f6 Install some simple defenses in postmaster startup to help ensure a useful
error message if the installation directory layout is messed up (or at least,
something more useful than the behavior exhibited in bug #4787).  During
postmaster startup, check that get_pkglib_path resolves as a readable
directory; and if ParseTzFile() fails to open the expected timezone
abbreviation file, check the possibility that the directory is missing rather
than just the specified file.  In case of either failure, issue a hint
suggesting that the installation is broken.  These two checks cover the lib/
and share/ trees of a full installation, which should take care of most
scenarios where a sysadmin decides to get cute.
2009-05-02 22:02:37 +00:00
Tom Lane 27fbfd396c Remove a boatload of useless definitions of 'int optreset'. If we
are using our own ports of getopt or getopt_long, those will define
the variable for themselves; and if not, we don't need these, because
we never touch the variable anyway.
2009-04-05 04:19:59 +00:00
Tom Lane 948d6ec90f Modify the relcache to record the temp status of both local and nonlocal
temp relations; this is no more expensive than before, now that we have
pg_class.relistemp.  Insert tests into bufmgr.c to prevent attempting
to fetch pages from nonlocal temp relations.  This provides a low-level
defense against bugs-of-omission allowing temp pages to be loaded into shared
buffers, as in the contrib/pgstattuple problem reported by Stuart Bishop.
While at it, tweak a bunch of places to use new relcache tests (instead of
expensive probes into pg_namespace) to detect local or nonlocal temp tables.
2009-03-31 22:12:48 +00:00
Peter Eisentraut 8032d76b5b Gettext plural support
In the backend, I changed only a handful of exemplary or important-looking
instances to make use of the plural support; there is probably more work
there.  For the rest of the source, this should cover all relevant cases.
2009-03-26 22:26:08 +00:00
Heikki Linnakangas 8e1a8fe288 Fix Windows-specific race condition in syslogger. This could've been
the cause of the "could not write to log file: Bad file descriptor"
errors reported at
http://archives.postgresql.org//pgsql-general/2008-06/msg00193.php

Backpatch to 8.3, the race condition was introduced by the CSV logging
patch.

Analysis and patch by Gurjeet Singh.
2009-03-18 08:44:49 +00:00
Heikki Linnakangas fb7df896fc Reload config file in startup process on SIGHUP.
Fujii Masao
2009-03-04 13:56:40 +00:00
Heikki Linnakangas fd75329e81 Fix copy-pasto in the patch to allow background writer to run during
recovery: if background writer or pgstat process dies during recovery (or
any other child process, but those two are the only ones running), send
SIGQUIT to the startup process using correct pid.
2009-03-03 10:42:05 +00:00
Heikki Linnakangas 0160cebee9 Put back a "continue" that went missing in the changes to start background
writer in WAL recovery.
2009-02-25 11:07:43 +00:00
Peter Eisentraut 7380b6384b Don't append epoch to log_filename if no format specifier is given.
Robert Haas
2009-02-24 12:09:09 +00:00
Heikki Linnakangas bc134d7a51 Change the signaling of end-of-recovery. Startup process now indicates end
of recovery by exiting with exit code 0, like in previous releases. Per
Tom's suggestion.
2009-02-23 09:28:50 +00:00
Heikki Linnakangas 5717f3a3e6 Fix bogus comment, from the patch to start bgwriter during archive
recovery.
2009-02-19 16:43:13 +00:00
Heikki Linnakangas cdd46c7654 Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.

This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.

The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.

Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.

log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.

This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.

Simon Riggs, with lots of modifications by me.
2009-02-18 15:58:41 +00:00
Alvaro Herrera 834a6da4f7 Update autovacuum to use reloptions instead of a system catalog, for
per-table overrides of parameters.

This removes a whole class of problems related to misusing the catalog,
and perhaps more importantly, gives us pg_dump support for the parameters.

Based on a patch by Euler Taveira de Oliveira, heavily reworked by me.
2009-02-09 20:57:59 +00:00
Tom Lane 623cf5edec Add a failure check for syslogger's use of _beginthreadex(), and remove
unnecessary thread address output parameter, to make this code look more
like that in pg_restore.
2009-02-03 00:59:26 +00:00
Heikki Linnakangas 6587818542 Add vacuum_freeze_table_age GUC option, to control when VACUUM should
ignore the visibility map and scan the whole table, to advance
relfrozenxid.
2009-01-16 13:27:24 +00:00
Tom Lane 7466eeac61 Add contrib/pg_stat_statements for server-wide tracking of statement execution
statistics.

Takahiro Itagaki
2009-01-04 22:19:59 +00:00
Tom Lane dad75a62bf Create a "shmem_startup_hook" to be called at the end of shared memory
initialization, to give loadable modules a reasonable place to perform
creation of any shared memory areas they need.  This is the logical conclusion
of our previous creation of RequestAddinShmemSpace() and RequestAddinLWLocks().
We don't need an explicit shmem_shutdown_hook, because the existing
on_shmem_exit and on_proc_exit mechanisms serve that need.

Also, adjust SubPostmasterMain so that libraries that got loaded into the
postmaster will be loaded into all child processes, not only regular backends.
This improves consistency with the non-EXEC_BACKEND behavior, and might be
necessary for functionality for some types of add-ons.
2009-01-03 17:08:39 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Heikki Linnakangas dcf8409985 Don't reset pg_class.reltuples and relpages in VACUUM, if any pages were
skipped. We could update relpages anyway, but it seems better to only
update it together with reltuples, because we use the reltuples/relpages
ratio in the planner. Also don't update n_live_tuples in pgstat.

ANALYZE in VACUUM ANALYZE now needs to update pg_class, if the
VACUUM-phase didn't do so. Added some boolean-passing to let analyze_rel
know if it should update pg_class or not.

I also moved the relcache invalidation (to update rd_targblock) from
vac_update_relstats to where RelationTruncate is called, because
vac_update_relstats is not called for partial vacuums anymore. It's more
obvious to send the invalidation close to the truncation that requires it.

Per report by Ned T. Crigler.
2008-12-17 09:15:03 +00:00
Peter Eisentraut d9346f2186 The macros NULL_DEV and DEVNULL were both used to work around
platform-specific spellings of /dev/null.  But one should be enough, so
settle on DEVNULL.
2008-12-11 10:25:17 +00:00
Heikki Linnakangas dea81a6cf6 Revert SIGUSR1 multiplexing patch, per Tom's objection. 2008-12-09 15:59:39 +00:00
Heikki Linnakangas 7b05b3fa39 Provide support for multiplexing SIGUSR1 signal. The upcoming synchronous
replication patch needs a signal, but we've already used SIGUSR1 and
SIGUSR2 in normal backends. This patch allows reusing SIGUSR1 for that,
and for other purposes too if the need arises.
2008-12-09 14:28:20 +00:00
Tom Lane 4e0b63b0b9 Teach pgstat_vacuum_stat to not bother scanning pg_proc in the common case
where no function stats entries exist.  Partial response to Pavel's
observation that small VACUUM operations are noticeably slower in CVS HEAD
than 8.3.
2008-12-08 15:44:54 +00:00
Heikki Linnakangas 7537f52a00 Utilize the visibility map in autovacuum, too. There was an oversight in
the visibility map patch that because autovacuum always sets
VacuumStmt->freeze_min_age, visibility map was never used for autovacuum,
only for manually launched vacuums. This patch introduces a new scan_all
field to VacuumStmt, indicating explicitly whether the visibility map
should be used, or the whole relation should be scanned, to advance
relfrozenxid. Anti-wraparound vacuums still need to scan all pages.
2008-12-04 11:42:24 +00:00
Tom Lane 6f6a6d8b14 Teach RequestCheckpoint() to wait and retry a few times if it can't signal
the bgwriter immediately.  This covers the case where the bgwriter is still
starting up, as seen in a recent buildfarm failure.  In future it might also
assist with clean recovery after a bgwriter termination and restart ---
right now the postmaster treats early bgwriter exit as a system crash,
but that might not always be so.
2008-11-23 01:40:19 +00:00
Heikki Linnakangas 4c22564471 Fix off-by-one error in autovacuum shmem struct sizing. This could lead to
autovacuum worker sending SIGUSR1 signal to wrong process, per Zou Yong's
report.

Backpatch to 8.3.
2008-11-12 10:10:32 +00:00
Peter Eisentraut 9beb9e761b Fix compiler warning about uninitialized variable 2008-11-04 11:04:06 +00:00
Alvaro Herrera 88dd4b0a0d Reduce the acceptable staleness of pgstat data for autovacuum, per the
longstanding note in the source that this patch removes.
2008-11-03 19:03:41 +00:00
Tom Lane 3c2313f481 Change the pgstat logic so that the stats collector writes the stats file only
upon requests from backends, rather than on a fixed 500msec cycle.  (There's
still throttling logic to ensure it writes no more often than once per
500msec, though.)  This should result in a significant reduction in stats file
write traffic in typical scenarios where the stats are demanded only
infrequently.

This approach also means that the former difficulty with changing
stats_temp_directory on-the-fly has gone away, so remove the caution about
that as well as the thrashing we did to minimize the trouble window.

In passing, also fix pgstat_report_stat() so that we will send a stats
message if we have function call stats but not table stats to report;
this fixes a bug in the recent patch to support function-call stats.

Martin Pihlak
2008-11-03 01:17:08 +00:00
Tom Lane d7112cfa88 Remove the last vestiges of the MAKE_PTR/MAKE_OFFSET mechanism. We haven't
allowed different processes to have different addresses for the shmem segment
in quite a long time, but there were still a few places left that used the
old coding convention.  Clean them up to reduce confusion and improve the
compiler's ability to detect pointer type mismatches.

Kris Jurka
2008-11-02 21:24:52 +00:00
Magnus Hagander 53a5026b5c Remove support for (insecure) crypt authentication.
This breaks compatibility with pre-7.2 versions.
2008-10-28 12:10:44 +00:00
Heikki Linnakangas 84c3769482 Fix oversight in the relation forks patch: forgot to copy fork number to
fsync requests. This should fix the installcheck failure of the buildfarm
member "kudu".
2008-10-14 08:06:39 +00:00
Heikki Linnakangas 15c121b3ed Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the
free space information is stored in a dedicated FSM relation fork, with each
relation (except for hash indexes; they don't use FSM).

This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
trace of them from the backend, initdb, and documentation.

Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
introduce a new variant of the get_raw_page(regclass, int4, int4) function in
contrib/pageinspect that let's you to return pages from any relation fork, and
a new fsm_page_contents() function to inspect the new FSM pages.
2008-09-30 10:52:14 +00:00
Bruce Momjian 5f7b25d5d5 Add comment about the use of EXEC_BACKEND. 2008-09-23 20:35:38 +00:00
Heikki Linnakangas 61d9674988 Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
2008-09-23 09:20:39 +00:00
Magnus Hagander 9872381090 Parse pg_hba.conf in postmaster, instead of once in each backend for
each connection. This makes it possible to catch errors in the pg_hba
file when it's being reloaded, instead of silently reloading a broken
file and failing only when a user tries to connect.

This patch also makes the "sameuser" argument to ident authentication
optional.
2008-09-15 12:32:57 +00:00
Magnus Hagander f1e237b6b2 Unconditionally write the statsfile when SIGHUP is received, to minimize
the window during which backends have no statistics file to read.
2008-08-25 18:55:43 +00:00
Magnus Hagander be8d6c5c34 Make stats_temp_directory PGC_SIGHUP, and document how it may cause a temporary
"outage" of the statistics views.

This requires making the stats collector respond to SIGHUP, like the other
utility processes already did.
2008-08-25 15:11:01 +00:00
Magnus Hagander 5b8eb2b4b9 Make the temporary directory for pgstat files configurable by the GUC
variable stats_temp_directory, instead of requiring the admin to
mount/symlink the pg_stat_tmp directory manually.

For now the config variable is PGC_POSTMASTER. Room for further improvment
that would allow it to be changed on-the-fly.
2008-08-15 08:37:41 +00:00
Alvaro Herrera 3ccde312ec Have autovacuum consider processing TOAST tables separately from their
main tables.

This requires vacuum() to accept processing a toast table standalone, so
there's a user-visible change in that it's now possible (for a superuser) to
execute "VACUUM pg_toast.pg_toast_XXX".
2008-08-13 00:07:50 +00:00
Heikki Linnakangas 3f0e808c4a Introduce the concept of relation forks. An smgr relation can now consist
of multiple forks, and each fork can be created and grown separately.

The bulk of this patch is about changing the smgr API to include an extra
ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
smgrdounlink no longer implicitly call smgrclose, because other forks might
still exist after unlinking one. The callers of those functions have been
modified to call smgrclose instead.

This patch in itself doesn't have any user-visible effect, but provides the
infrastructure needed for upcoming patches. The additional forks envisioned
are a rewritten FSM implementation that doesn't rely on a fixed-size shared
memory block, and a visibility map to allow skipping portions of a table in
VACUUM that have no dead tuples.
2008-08-11 11:05:11 +00:00
Magnus Hagander 70d756970b Move pgstat.tmp into a temporary directory under $PGDATA named pg_stat_tmp.
This allows the use of a ramdrive (either through mount or symlink) for
the temporary file that's written every half second, which should
reduce I/O.

On server shutdown/startup, the file is written to the old location in
the global directory, to preserve data across restarts.

Bump catversion since the $PGDATA directory layout changed.
2008-08-05 12:09:30 +00:00
Alvaro Herrera e36e6b1cab Add a few more DTrace probes to the backend.
Robert Lor
2008-08-01 13:16:09 +00:00
Alvaro Herrera 85dfe376d9 Ratchet up patch to improve autovacuum wraparound messages.
Simon Riggs
2008-07-23 20:20:10 +00:00
Alvaro Herrera 0d09688f88 Publish more openly the fact that autovacuum is working for wraparound
protection.

Simon Riggs
2008-07-21 15:27:02 +00:00
Alvaro Herrera 46c5a212ec Avoid crashing when a table is deleted while we're on the process of checking
it.

Per report from Tom Lane based on buildfarm evidence.
2008-07-17 21:02:31 +00:00
Tom Lane 5b965bf08b Teach autovacuum how to determine whether a temp table belongs to a crashed
backend.  If so, send a LOG message to the postmaster log, and if the table
is beyond the vacuum-for-wraparound horizon, forcibly drop it.  Per recent
discussions.  Perhaps we ought to back-patch this, but it probably needs
to age a bit in HEAD first.
2008-07-01 02:09:34 +00:00
Heikki Linnakangas 995fb74202 Turn PGBE_ACTIVITY_SIZE into a GUC variable, track_activity_query_size.
As the buffer could now be a lot larger than before, and copying it could
thus be a lot more expensive than before, use strcpy instead of memcpy to
copy the query string, as was already suggested in comments. Also, only copy
the PgBackendStatus struct and string if the slot is in use.

Patch by Thomas Lee, with some changes by me.
2008-06-30 10:58:47 +00:00
Bruce Momjian 067f1e5fa8 Fix 'pg_ctl restart' to preserve command-line arguments. 2008-06-26 02:47:19 +00:00
Bruce Momjian a1183238be Use SYSTEMQUOTE as concatentation to strings, rather than %s printf
patterns, for clarity.
2008-06-26 01:35:45 +00:00
Tom Lane fad153ec45 Rewrite the sinval messaging mechanism to reduce contention and avoid
unnecessary cache resets.  The major changes are:

* When the queue overflows, we only issue a cache reset to the specific
backend or backends that still haven't read the oldest message, rather
than resetting everyone as in the original coding.

* When we observe backend(s) falling well behind, we signal SIGUSR1
to only one backend, the one that is furthest behind and doesn't already
have a signal outstanding for it.  When it finishes catching up, it will
in turn signal SIGUSR1 to the next-furthest-back guy, if there is one that
is far enough behind to justify a signal.  The PMSIGNAL_WAKEN_CHILDREN
mechanism is removed.

* We don't attempt to clean out dead messages after every message-receipt
operation; rather, we do it on the insertion side, and only when the queue
fullness passes certain thresholds.

* Split SInvalLock into SInvalReadLock and SInvalWriteLock so that readers
don't block writers nor vice versa (except during the infrequent queue
cleanout operations).

* Transfer multiple sinval messages for each acquisition of a read or
write lock.
2008-06-19 21:32:56 +00:00
Alvaro Herrera a3540b0f65 Improve our #include situation by moving pointer types away from the
corresponding struct definitions.  This allows other headers to avoid including
certain highly-loaded headers such as rel.h and relscan.h, instead using just
relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less
unnecessary dependencies.
2008-06-19 00:46:06 +00:00
Alvaro Herrera e4ca6cac43 Change xlog.h to xlogdefs.h in bufpage.h, and fix fallout. 2008-06-06 22:35:22 +00:00
Alvaro Herrera 9319fd89e1 Modify vacuum() to accept a single relation OID instead of a list (which we
always pass as a single element anyway.)  In passing, fix an outdated comment.
2008-06-05 15:47:32 +00:00
Tom Lane 93c701edc6 Add support for tracking call counts and elapsed runtime for user-defined
functions.

Note that because this patch changes FmgrInfo, any external C functions
you might be testing with 8.4 will need to be recompiled.

Patch by Martin Pihlak, some editorialization by me (principally, removing
tracking of getrusage() numbers)
2008-05-15 00:17:41 +00:00
Alvaro Herrera f8c4d7db60 Restructure some header files a bit, in particular heapam.h, by removing some
unnecessary #include lines in it.  Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.

For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.

While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
2008-05-12 00:00:54 +00:00
Tom Lane 600da67fbe Add pg_conf_load_time() function to report when the Postgres configuration
files were last loaded.

George Gensure
2008-05-04 21:13:36 +00:00
Tom Lane ea0382e370 Code review for recent patch to terminate online backup during shutdown:
do CancelBackup at a sane place, fix some oversights in the state transitions,
allow only superusers to connect while we are waiting for backup mode to end.
2008-04-26 22:47:40 +00:00
Magnus Hagander c979a1fefa Prevent shutdown in normal mode if online backup is running, and
have pg_ctl warn about this.

Cancel running online backups (by renaming the backup_label file,
thus rendering the backup useless) when shutting down in fast mode.

Laurenz Albe
2008-04-23 13:44:59 +00:00
Tom Lane 51e1445f10 Teach ANALYZE to distinguish dead and in-doubt tuples, which it formerly
classed all as "dead"; also get it to count DEAD item pointers as dead rows,
instead of ignoring them as before.  Also improve matters so that tuples
previously inserted or deleted by our own transaction are handled nicely:
the stats collector's live-tuple and dead-tuple counts will end up correct
after our transaction ends, regardless of whether we end in commit or abort.

While there's more work that could be done to improve the counting of in-doubt
tuples in both VACUUM and ANALYZE, this commit is enough to alleviate some
known bad behaviors in 8.3; and the other stuff that's been discussed seems
like research projects anyway.

Pavan Deolasee and Tom Lane
2008-04-03 16:27:25 +00:00
Tom Lane 3405f2b925 Use error message wordings for permissions checks on .pgpass and SSL private
key files that are similar to the one for the postmaster's data directory
permissions check.  (I chose to standardize on that one since it's the most
heavily used and presumably best-wordsmithed by now.)  Also eliminate explicit
tests on file ownership in these places, since the ensuing read attempt must
fail anyway if it's wrong, and there seems no value in issuing the same error
message for distinct problems.  (But I left in the explicit ownership test in
postmaster.c, since it had its own error message anyway.)  Also be more
specific in the documentation's descriptions of these checks.  Per a gripe
from Kevin Hunter.
2008-03-31 02:43:14 +00:00
Alvaro Herrera 73b0300b2a Move the HTSU_Result enum definition into snapshot.h, to avoid including
tqual.h into heapam.h.  This makes all inclusion of tqual.h explicit.

I also sorted alphabetically the includes on some source files.
2008-03-26 21:10:39 +00:00
Tom Lane 9b8e1eb375 Adjust the recent patch for reporting of deadlocked queries so that we report
query texts only to the server log.  This eliminates the issue of possible
leaking of security-sensitive data in other sessions' queries.  Since the
log is presumed secure, we can now log the queries of all sessions involved
in the deadlock, whether or not they belong to the same user as the one
reporting the failure.
2008-03-24 18:22:36 +00:00
Tom Lane 4b7ae4afae Report the current queries of all backends involved in a deadlock
(if they'd be visible to the current user in pg_stat_activity).

This might look like it's subject to race conditions, but it's actually
pretty safe because at the time DeadLockReport() is constructing the
report, we haven't yet aborted our transaction and so we can expect that
everyone else involved in the deadlock is still blocked on some lock.
(There are corner cases where that might not be true, such as a statement
timeout triggering in another backend before we finish reporting; but at
worst we'd report a misleading activity string, so it seems acceptable
considering the usefulness of reporting the queries.)

Original patch by Itagaki Takahiro, heavily modified by me.
2008-03-21 21:08:31 +00:00
Alvaro Herrera 470c6c12a1 Remove another useless snapshot creation. 2008-03-19 21:14:20 +00:00
Tom Lane 4873c96ff3 Fix inappropriately-timed memory context switch in autovacuum_do_vac_analyze.
This accidentally failed to fail before 8.3, because the context we were
switching back to was long-lived anyway; but it sure looks risky as can be
now.  Well spotted by Pavan Deolasee.
2008-03-14 23:49:28 +00:00
Alvaro Herrera adc4e1e635 Fix vacuum so that autovacuum is really not cancelled when doing an emergency
job (i.e. to prevent Xid wraparound problems.)  Bug reported by ITAGAKI
Takahiro in 20080314103837.63D3.52131E4D@oss.ntt.co.jp, though I didn't use his
patch.
2008-03-14 17:25:59 +00:00
Tom Lane d9384a4b73 Remove postmaster.c's check that NBuffers is at least twice MaxBackends.
With the addition of multiple autovacuum workers, our choices were to delete
the check, document the interaction with autovacuum_max_workers, or complicate
the check to try to hide that interaction.  Since this restriction has never
been adequate to ensure backends can't run out of pinnable buffers, it doesn't
really have enough excuse to live to justify the second or third choices.
Per discussion of a complaint from Andreas Kling (see also bug #3888).

This commit also removes several documentation references to this restriction,
but I'm not sure I got them all.
2008-03-09 04:56:28 +00:00
Tom Lane 870993e871 Rename miscadmin.h's PG_VERSIONSTR macro to PG_BACKEND_VERSIONSTR to
make it a bit clearer what it is, and get rid of duplicate definitions
in initdb and pg_ctl.
2008-02-20 22:46:24 +00:00
Alvaro Herrera bccc8e3608 Change error message to be able to differentiate the two cases. Per suggestion
from Jaime Casanova.
2008-02-20 14:01:45 +00:00
Peter Eisentraut 0474dcb608 Refactor backend makefiles to remove lots of duplicate code 2008-02-19 10:30:09 +00:00
Tom Lane cd00406774 Replace time_t with pg_time_t (same values, but always int64) in on-disk
data structures and backend internal APIs.  This solves problems we've seen
recently with inconsistent layout of pg_control between machines that have
32-bit time_t and those that have already migrated to 64-bit time_t.  Also,
we can get out from under the problem that Windows' Unix-API emulation is not
consistent about the width of time_t.

There are a few remaining places where local time_t variables are used to hold
the current or recent result of time(NULL).  I didn't bother changing these
since they do not affect any cross-module APIs and surely all platforms will
have 64-bit time_t before overflow becomes an actual risk.  time_t should
be avoided for anything visible to extension modules, however.
2008-02-17 02:09:32 +00:00
Tom Lane ace1b29b04 Fix two different copy-and-paste-os in CSV log rotation logic; one that led to
a double-pfree crash and another that effectively disabled size-based rotation
for CSV logs.  Also suppress a memory leak and make some trivial cosmetic
improvements.  Per bug #3901 from Chris Hoover and additional code-reading.
2008-01-25 20:42:10 +00:00
Alvaro Herrera 7aa4164363 Mark autovacuum entries in pg_stat_activity so that they can be easily
distinguished from user-invoked commands.  Per suggestion from Tom Lane.
2008-01-14 13:39:25 +00:00
Tom Lane e6a442c71b Restructure the shutdown procedure for the archiver process to allow it to
finish archiving everything (when there's no error), and to eliminate various
hazards as best we can.  This fixes a previous 8.3 patch that caused the
postmaster to kill and then restart the archiver during shutdown (!?).

The new behavior is that the archiver is allowed to run unmolested until
the bgwriter has exited; then it is sent SIGUSR2 to tell it to do a final
archiving cycle and quit.  We only SIGQUIT the archiver if we want a panic
stop; this is important since SIGQUIT will also be sent to any active
archive_command.  The postmaster also now doesn't SIGQUIT the stats collector
until the bgwriter is done, since the bgwriter can send stats messages in 8.3.
The postmaster will not exit until both the archiver and stats collector are
gone; this provides some defense (not too bulletproof) against conflicting
archiver or stats collector processes being started by a new postmaster
instance.  We continue the prior practice that the archiver will check
for postmaster death immediately before issuing any archive_command; that
gives some additional protection against conflicting archivers.

Also, modify the archiver process to notice SIGTERM and refuse to issue any
more archive commands if it gets it.  The postmaster doesn't ever send it
SIGTERM; we assume that any such signal came from init and is a notice of
impending whole-system shutdown.  In this situation it seems imprudent to try
to start new archive commands --- if they aren't extremely quick they're
likely to get SIGKILL'd by init.

All per discussion.
2008-01-11 00:54:09 +00:00