Commit Graph

368 Commits

Author SHA1 Message Date
Robert Haas b89e151054 Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables.  The output format is controlled by a
so-called "output plugin"; an example is included.  To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.

Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.

Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
2014-03-03 16:32:18 -05:00
Bruce Momjian 7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Alvaro Herrera dd778e9d88 Rename various "freeze multixact" variables
It seems to make more sense to use "cutoff multixact" terminology
throughout the backend code; "freeze" is associated with replacing of an
Xid with FrozenTransactionId, which is not what we do for MultiXactIds.

Andres Freund
Some adjustments by Álvaro Herrera
2013-09-16 15:47:31 -03:00
Robert Haas 568d4138c6 Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row.  In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result.  This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.

The major issue has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow.  However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads.  To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed.  The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all.  Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.

Patch by me.  Review by Michael Paquier and Andres Freund.
2013-07-02 09:47:01 -04:00
Bruce Momjian 9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Robert Haas 05f3f9c7b2 Extend object-access hook machinery to support post-alter events.
This also slightly widens the scope of what we support in terms of
post-create events.

KaiGai Kohei, with a few changes, mostly to the comments, by me
2013-03-17 22:57:26 -04:00
Robert Haas f90cc26982 Code beautification for object-access hook machinery.
KaiGai Kohei
2013-03-06 20:53:25 -05:00
Alvaro Herrera 0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00
Tom Lane c2a14bc7c9 Protect against SnapshotNow race conditions in pg_tablespace scans.
Use of SnapshotNow is known to expose us to race conditions if the tuple(s)
being sought could be updated by concurrently-committing transactions.
CREATE DATABASE and DROP DATABASE are particularly exposed because they do
heavyweight filesystem operations during their scans of pg_tablespace,
so that the scans run for a very long time compared to most.  Furthermore,
the potential consequences of a missed or twice-visited row are nastier
than average:

* createdb() could fail with a bogus "file already exists" error, or
  silently fail to copy one or more tablespace's worth of files into the
  new database.

* remove_dbtablespaces() could miss one or more tablespaces, thus failing
  to free filesystem space for the dropped database.

* check_db_file_conflict() could likewise miss a tablespace, leading to an
  OID conflict that could result in data loss either immediately or in
  future operations.  (This seems of very low probability, though, since a
  duplicate database OID would be unlikely to start with.)

Hence, it seems worth fixing these three places to use MVCC snapshots, even
though this will someday be superseded by a generic solution to SnapshotNow
race conditions.

Back-patch to all active branches.

Stephen Frost and Tom Lane
2013-01-18 18:06:20 -05:00
Bruce Momjian bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Robert Haas 82b1b213ca Adjust more backend functions to return OID rather than void.
This is again intended to support extensions to the event trigger
functionality.  This may go a bit further than we need for that
purpose, but there's some value in being consistent, and the OID
may be useful for other purposes also.

Dimitri Fontaine
2012-12-29 07:55:37 -05:00
Robert Haas c504513f83 Adjust many backend functions to return OID rather than void.
Extracted from a larger patch by Dimitri Fontaine.  It is hoped that
this will provide infrastructure for enriching the new event trigger
functionality, but it seems possibly useful for other purposes as
well.
2012-12-23 18:37:58 -05:00
Alvaro Herrera 1577b46b7c Split out rmgr rm_desc functions into their own files
This is necessary (but not sufficient) to have them compilable outside
of a backend environment.
2012-11-28 13:01:15 -03:00
Alvaro Herrera c219d9b0a5 Split tuple struct defs from htup.h to htup_details.h
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.

I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h.  In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
2012-08-30 16:52:35 -04:00
Alvaro Herrera 45326c5a11 Split resowner.h
This lets files that are mere users of ResourceOwner not automatically
include the headers for stuff that is managed by the resowner mechanism.
2012-08-28 18:02:07 -04:00
Peter Eisentraut d933092e0a Add more message pluralization
Even though we can't do much about the case with multiple plurals in
one sentence, we can fix the other cases.
2012-06-15 02:02:02 +03:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Peter Eisentraut 64e1309c76 Consistently quote encoding and locale names in messages 2012-04-13 20:37:07 +03:00
Tom Lane c7cea267de Replace empty locale name with implied value in CREATE DATABASE and initdb.
setlocale() accepts locale name "" as meaning "the locale specified by the
process's environment variables".  Historically we've accepted that for
Postgres' locale settings, too.  However, it's fairly unsafe to store an
empty string in a new database's pg_database.datcollate or datctype fields,
because then the interpretation could vary across postmaster restarts,
possibly resulting in index corruption and other unpleasantness.

Instead, we should expand "" to whatever it means at the moment of calling
CREATE DATABASE, which we can do by saving the value returned by
setlocale().

For consistency, make initdb set up the initial lc_xxx parameter values the
same way.  initdb was already doing the right thing for empty locale names,
but it did not replace non-empty names with setlocale results.  On a
platform where setlocale chooses to canonicalize the spellings of locale
names, this would result in annoying inconsistency.  (It seems that popular
implementations of setlocale don't do such canonicalization, which is a
pity, but the POSIX spec certainly allows it to be done.)  The same risk
of inconsistency leads me to not venture back-patching this, although it
could certainly be seen as a longstanding bug.

Per report from Jeff Davis, though this is not his proposed patch.
2012-03-25 21:47:22 -04:00
Robert Haas 07d1edb954 Extend object access hook framework to support arguments, and DROP.
This allows loadable modules to get control at drop time, perhaps for the
purpose of performing additional security checks or to log the event.
The initial purpose of this code is to support sepgsql, but other
applications should be possible as well.

KaiGai Kohei, reviewed by me.
2012-03-09 14:34:56 -05:00
Bruce Momjian e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
Simon Riggs f3ebaad45b Comment changes to show bgwriter no longer performs checkpoints. 2011-11-01 18:48:47 +00:00
Bruce Momjian 6416a82a62 Remove unnecessary #include references, per pgrminclude script. 2011-09-01 10:04:27 -04:00
Robert Haas 463f2625a5 Support SECURITY LABEL on databases, tablespaces, and roles.
This requires a new shared catalog, pg_shseclabel.

Along the way, fix the security_label regression tests so that they
don't monkey with the labels of any pre-existing objects.  This is
unlikely to matter in practice, since only the label for the "dummy"
provider was being manipulated.  But this way still seems cleaner.

KaiGai Kohei, with fairly extensive hacking by me.
2011-07-20 13:18:24 -04:00
Bruce Momjian bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Peter Eisentraut b313bca0af DDL support for collations
- collowner field
- CREATE COLLATION
- ALTER COLLATION
- DROP COLLATION
- COMMENT ON COLLATION
- integration with extensions
- pg_dump support for the above
- dependency management
- psql tab completion
- psql \dO command
2011-02-12 15:55:18 +02:00
Peter Eisentraut 414c5a2ea6 Per-column collation support
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.

Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
2011-02-08 23:04:18 +02:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Robert Haas cc1ed40d57 Object access hook framework, with post-creation hook.
After a SQL object is created, we provide an opportunity for security
or logging plugins to get control; for example, a security label provider
could use this to assign an initial security label to newly created
objects.  The basic infrastructure is (hopefully) reusable for other types
of events that might require similar treatment.

KaiGai Kohei, with minor adjustments.
2010-11-25 11:50:13 -05:00
Robert Haas 11e482c350 Move copydir() prototype into its own header file.
Having this in src/include/port.h makes no sense, now that copydir.c lives
in src/backend/strorage rather than src/port.  Along the way, remove an
obsolete comment from contrib/pg_upgrade that makes reference to the old
location.
2010-11-12 16:39:53 -05:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Robert Haas 2a6ef3445c Standardize get_whatever_oid functions for object types with
unqualified names.

- Add a missing_ok parameter to get_tablespace_oid.
- Avoid duplicating get_tablespace_od guts in objectNamesToOids.
- Add a missing_ok parameter to get_database_oid.
- Replace get_roleid and get_role_checked with get_role_oid.
- Add get_namespace_oid, get_language_oid, get_am_oid.
- Refactor existing code to use new interfaces.

Thanks to KaiGai Kohei for the review.
2010-08-05 14:45:09 +00:00
Bruce Momjian c86f467d18 Properly replay CREATE TABLESPACE during crash recovery by deleting
directory/symlink before creation.

Report from Tom Lane.

Backpatch to 9.0.
2010-07-20 18:14:16 +00:00
Bruce Momjian 65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Robert Haas e26c539e9f Wrap calls to SearchSysCache and related functions using macros.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4).  This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.

Design and review by Tom Lane.
2010-02-14 18:42:19 +00:00
Simon Riggs 8bfd1a8848 Lock database while running drop database in Hot Standby to protect
against concurrent reconnection. Failure during testing showed issue
was possible, even though earlier analysis seemed to indicate it
would not be required. Use LockSharedObjectForSession() before
ResolveRecoveryConflictWithDatabase() and hold lock until end of
processing for that WAL record. Simple approach to avoid introducing
further bugs at this stage of development on an improbable issue.
2010-01-16 14:16:31 +00:00
Simon Riggs e99767bc28 First part of refactoring of code for ResolveRecoveryConflict. Purposes
of this are to centralise the conflict code to allow further change,
as well as to allow passing through the full reason for the conflict
through to the conflicting backends. Backend state alters how we
can handle different types of conflict so this is now required.
As originally suggested by Heikki, no longer optional.
2010-01-14 11:08:02 +00:00
Simon Riggs 3bfcccc295 During Hot Standby, fix drop database when sessions idle.
Previously we only cancelled sessions that were in-transaction.

Simple fix is to just cancel all sessions without waiting. Doing
it this way avoids complicating common code paths, which would
not be worth the trouble to cover this rare case.

Problem report and fix by Andres Freund, edited somewhat by me
2010-01-10 15:44:28 +00:00
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Simon Riggs efc16ea520 Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.

New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.

This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.

Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 01:32:45 +00:00
Tom Lane 8f8a5df694 Make initdb behave sanely when the selected locale has codeset "US-ASCII".
Per discussion, this should result in defaulting to SQL_ASCII encoding.
The original coding could not support that because it conflated selection
of SQL_ASCII encoding with not being able to determine the encoding.
Adjust pg_get_encoding_from_locale()'s API to distinguish these cases,
and fix callers appropriately.  Only initdb actually changes behavior,
since the other callers were perfectly content to consider these cases
equivalent.

Per bug #5178 from Boh Yap.  Not going to bother back-patching, since
no one has complained before and there's an easy workaround (namely,
specify the encoding you want).
2009-11-12 02:46:16 +00:00
Alvaro Herrera 2eda8dfb52 Make it possibly to specify GUC params per user and per database.
Create a new catalog pg_db_role_setting where they are now stored, and better
encapsulate the code that deals with settings into its realm.  The old
datconfig and rolconfig columns are removed.

psql has gained a \drds command to display the settings.

Backwards compatibility warning: while the backwards-compatible system views
still have the config columns, they no longer completely represent the
configuration for a user or database.

Catalog version bumped.
2009-10-07 22:14:26 +00:00
Alvaro Herrera a8bb8eb583 Remove flatfiles.c, which is now obsolete.
Recent commits have removed the various uses it was supporting.  It was a
performance bottleneck, according to bug report #4919 by Lauris Ulmanis; seems
it slowed down user creation after a billion users.
2009-09-01 02:54:52 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Tom Lane 421c66b76c Modify CREATE DATABASE to enforce that the source database's encoding setting
must be used for the new database, except when copying from template0.
This is the same rule that we now enforce for locale settings, and it has
the same motivation: databases other than template0 might contain data that
would be invalid according to a different setting.  This represents another
step in a continuing process of locking down ways in which encoding violations
could occur inside the backend.  Per discussion of a few days ago.

In passing, fix pre-existing breakage of mbregress.sh, and fix up a couple
of ereport() calls in dbcommands.c that failed to specify sqlstate codes.
2009-05-06 16:15:21 +00:00
Tom Lane d7ee335520 Tweak a comment to agree a bit better with the new dispensation that
locales are database-wide, not server-wide.
2009-05-05 23:39:55 +00:00
Tom Lane b9e9775e0c Don't use the result of strcmp as if it were a boolean.
A service of your local coding style police.
2009-04-23 17:39:21 +00:00
Alvaro Herrera 1bb257fae6 Add missing periods. 2009-04-15 21:36:12 +00:00
Heikki Linnakangas 1eef90d0a2 Rename the new CREATE DATABASE options to set collation and ctype into
LC_COLLATE and LC_CTYPE, per discussion on pgsql-hackers.
2009-04-06 08:42:53 +00:00
Heikki Linnakangas 4265ed9f4e Check that connection limit is within valid range. IOW, not < -1.
It's missing in older versions too, but it doesn't seem worth
back-porting. All negative are just harmlessly treated as "no limit", and
tightening the check might even brake an application that relies on it.
2009-01-30 17:24:47 +00:00
Heikki Linnakangas b2a667b9ee Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock should
be used instead of the normal exclusive lock, and make WAL redo functions
responsible for calling RestoreBkpBlocks(). They know better what kind of a
lock they need.

At the moment, this just moves things around with no functional change, but
makes the hot standby patch that's under review cleaner.
2009-01-20 18:59:37 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Tom Lane 6517f377d6 Implement ALTER DATABASE SET TABLESPACE to move a whole database (or at least
as much of it as lives in its default tablespace) to a new tablespace.

Guillaume Lelarge, with some help from Bernd Helmle and Tom Lane
2008-11-07 18:25:07 +00:00
Tom Lane 902d1cb35f Remove all uses of the deprecated functions heap_formtuple, heap_modifytuple,
and heap_deformtuple in favor of the newer functions heap_form_tuple et al
(which do the same things but use bool control flags instead of arbitrary
char values).  Eliminate the former duplicate coding of these functions,
reducing the deprecated functions to mere wrappers around the newer ones.
We can't get rid of them entirely because add-on modules probably still
contain many instances of the old coding style.

Kris Jurka
2008-11-02 01:45:28 +00:00
Heikki Linnakangas db31addaae Force a checkpoint in CREATE DATABASE before starting to copy the files,
to process any pending unlinks for the source database.

Before, if you dropped a relation in the template database just before
CREATE DATABASE, and a checkpoint happened during copydir(), the checkpoint
might delete a file that we're just about to copy, causing lstat() in
copydir() to fail with ENOENT.

Backpatch to 8.3, where the pending unlinks were introduced.

Per report by Matthew Wakeling and analysis by Tom Lane.
2008-10-09 10:34:06 +00:00
Heikki Linnakangas 15c121b3ed Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the
free space information is stored in a dedicated FSM relation fork, with each
relation (except for hash indexes; they don't use FSM).

This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
trace of them from the backend, initdb, and documentation.

Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
introduce a new variant of the get_raw_page(regclass, int4, int4) function in
contrib/pageinspect that let's you to return pages from any relation fork, and
a new fsm_page_contents() function to inspect the new FSM pages.
2008-09-30 10:52:14 +00:00
Heikki Linnakangas c2d4526495 Tighten the check in initdb and CREATE DATABASE that the chosen encoding
matches the encoding of the locale. LC_COLLATE is now checked in addition
to LC_CTYPE.
2008-09-23 10:58:03 +00:00
Heikki Linnakangas 61d9674988 Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
2008-09-23 09:20:39 +00:00
Tom Lane 4abd7b49f1 Improve CREATE/DROP/RENAME DATABASE so that when failing because the source
or target database is being accessed by other users, it tells you whether
the "other users" are live sessions or uncommitted prepared transactions.
(Indeed, it tells you exactly how many of each, but that's mostly just
because it was easy to do so.)  This should help forestall the gotcha of
not realizing that a prepared transaction is what's blocking the command.
Per discussion.
2008-08-04 18:03:46 +00:00
Alvaro Herrera f8c4d7db60 Restructure some header files a bit, in particular heapam.h, by removing some
unnecessary #include lines in it.  Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.

For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.

While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
2008-05-12 00:00:54 +00:00
Tom Lane b8e5581d76 Fix rmtree() so that it keeps going after failure to remove any individual
file; the idea is that we should clean up as much as we can, even if there's
some problem removing one file.  Make the error messages a bit less misleading,
too.  In passing, const-ify function arguments.
2008-04-18 17:05:45 +00:00
Heikki Linnakangas 9cb91f90c9 Fix two race conditions between the pending unlink mechanism that was put in
place to prevent reusing relation OIDs before next checkpoint, and DROP
DATABASE. First, if a database was dropped, bgwriter would still try to unlink
the files that the rmtree() call by the DROP DATABASE command has already
deleted, or is just about to delete. Second, if a database is dropped, and
another database is created with the same OID, bgwriter would in the worst
case delete a relation in the new database that happened to get the same OID
as a dropped relation in the old database.

To fix these race conditions:
- make rmtree() ignore ENOENT errors. This fixes the 1st race condition.
- make ForgetDatabaseFsyncRequests forget unlink requests as well.
- force checkpoint on in dropdb on all platforms

Since ForgetDatabaseFsyncRequests() is asynchronous, the 2nd change isn't
enough on its own to fix the problem of dropping and creating a database with
same OID, but forcing a checkpoint on DROP DATABASE makes it sufficient.

Per Tom Lane's bug report and proposal. Backpatch to 8.3.
2008-04-18 06:48:38 +00:00
Tom Lane d1cbd26ded Repair two places where SIGTERM exit could leave shared memory state
corrupted.  (Neither is very important if SIGTERM is used to shut down the
whole database cluster together, but there's a problem if someone tries to
SIGTERM individual backends.)  To do this, introduce new infrastructure
macros PG_ENSURE_ERROR_CLEANUP/PG_END_ENSURE_ERROR_CLEANUP that take care
of transiently pushing an on_shmem_exit cleanup hook.  Also use this method
for createdb cleanup --- that wasn't a shared-memory-corruption problem,
but SIGTERM abort of createdb could leave orphaned files lying around.

Backpatch as far as 8.2.  The shmem corruption cases don't exist in 8.1,
and the createdb usage doesn't seem important enough to risk backpatching
further.
2008-04-16 23:59:40 +00:00
Alvaro Herrera 73b0300b2a Move the HTSU_Result enum definition into snapshot.h, to avoid including
tqual.h into heapam.h.  This makes all inclusion of tqual.h explicit.

I also sorted alphabetically the includes on some source files.
2008-03-26 21:10:39 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Magnus Hagander 699a0ef7bb Re-allow UTF8 encodings on win32. Since UTF8 is converted to
UTF16 before being used, all (valid) locales will work for this.
2007-10-16 11:30:16 +00:00
Tom Lane 8468146b03 Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so.  For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway.  (This does force initdb
unfortunately.)

Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using.  To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.

It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs.  However the code is now prepared to avoid this
type of problem in future.

Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly.  The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
2007-10-13 20:18:42 +00:00
Tom Lane 6daef2bca4 Remove hack in pg_tablespace_aclmask() that disallowed permissions
on pg_global even to superusers, and replace it with checks in various
other places to complain about invalid uses of pg_global.  This ends
up being a bit more code but it allows a more specific error message
to be given, and it un-breaks pg_tablespace_size() on pg_global.
Per discussion.
2007-10-12 18:55:12 +00:00
Tom Lane 70b9b9b788 Change initdb and CREATE DATABASE to actively reject attempts to create
databases with encodings that are incompatible with the server's LC_CTYPE
locale, when we can determine that (which we can on most modern platforms,
I believe).  C/POSIX locale is compatible with all encodings, of course,
so there is still some usefulness to CREATE DATABASE's ENCODING option,
but this will insulate us against all sorts of recurring complaints
caused by mismatched settings.

I moved initdb's existing LC_CTYPE-to-encoding mapping knowledge into
a new src/port/ file so it could be shared by CREATE DATABASE.
2007-09-28 22:25:49 +00:00
Tom Lane e7889b83b7 Support SET FROM CURRENT in CREATE/ALTER FUNCTION, ALTER DATABASE, ALTER ROLE.
(Actually, it works as a plain statement too, but I didn't document that
because it seems a bit useless.)  Unify VariableResetStmt with
VariableSetStmt, and clean up some ancient cruft in the representation of
same.
2007-09-03 18:46:30 +00:00
Tom Lane 4a78cdeb6b Support an optional asynchronous commit mode, in which we don't flush WAL
before reporting a transaction committed.  Data consistency is still
guaranteed (unlike setting fsync = off), but a crash may lose the effects
of the last few transactions.  Patch by Simon, some editorialization by Tom.
2007-08-01 22:45:09 +00:00
Tom Lane 867e2c91a0 Implement "distributed" checkpoints in which the checkpoint I/O is spread
over a fairly long period of time, rather than being spat out in a burst.
This happens only for background checkpoints carried out by the bgwriter;
other cases, such as a shutdown checkpoint, are still done at full speed.

Remove the "all buffers" scan in the bgwriter, and associated stats
infrastructure, since this seems no longer very useful when the checkpoint
itself is properly throttled.

Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas,
and some minor API editorialization by me.
2007-06-28 00:02:40 +00:00
Tom Lane bd0a260928 Make CREATE/DROP/RENAME DATABASE wait a little bit to see if other backends
will exit before failing because of conflicting DB usage.  Per discussion,
this seems a good idea to help mask the fact that backend exit takes nonzero
time.  Remove a couple of thereby-obsoleted sleeps in contrib and PL
regression test sequences.
2007-06-01 19:38:07 +00:00
Tom Lane ebb6bae539 Cancel pending fsync requests during WAL replay of DROP DATABASE, per bug
report from David Darville.  Back-patch as far as 8.1, which may or may not
have the problem but it seems a safe change anyway.
2007-04-12 15:04:35 +00:00
Tom Lane b9527e9840 First phase of plan-invalidation project: create a plan cache management
module and teach PREPARE and protocol-level prepared statements to use it.
In service of this, rearrange utility-statement processing so that parse
analysis does not assume table schemas can't change before execution for
utility statements (necessary because we don't attempt to re-acquire locks
for utility statements when reusing a stored plan).  This requires some
refactoring of the ProcessUtility API, but it ends up cleaner anyway,
for instance we can get rid of the QueryContext global.

Still to do: fix up SPI and related code to use the plan cache; I'm tempted to
try to make SQL functions use it too.  Also, there are at least some aspects
of system state that we want to ensure remain the same during a replan as in
the original processing; search_path certainly ought to behave that way for
instance, and perhaps there are others.
2007-03-13 00:33:44 +00:00
Tom Lane f44271176e Call pgstat_drop_database during DROP DATABASE, so that any stats file
entries for the victim database go away sooner rather than later.  We already
did the equivalent thing at the per-relation level, not sure why it's not
been done for whole databases.  With this change, pgstat_vacuum_tabstat
should usually not find anything to do; though we still need it as a backstop
in case DROPDB or TABPURGE messages get lost under load.
2007-02-09 16:12:19 +00:00
Bruce Momjian 8b4ff8b6a1 Wording cleanup for error messages. Also change can't -> cannot.
Standard English uses "may", "can", and "might" in different ways:

        may - permission, "You may borrow my rake."

        can - ability, "I can lift that log."

        might - possibility, "It might rain today."

Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice.  Similarly, "It may crash" is better stated, "It might crash".
2007-02-01 19:10:30 +00:00
Tom Lane eddbf39756 Extend yesterday's patch so that the bgwriter is also told to forget
pending fsyncs during DROP DATABASE.  Obviously necessary in hindsight :-(
2007-01-17 16:25:01 +00:00
Alvaro Herrera eb63cc3da8 Arrange for autovacuum to be killed when another operation wants to be alone
accessing it, like DROP DATABASE.  This allows the regression tests to pass
with autovacuum enabled, which open the gates for finally enabling autovacuum
by default.
2007-01-16 13:28:57 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane 48188e1621 Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios.  We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases.  Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId.  Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done.  Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database.  initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs.  Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 22:42:10 +00:00
Tom Lane 1e758d5263 Add some code to CREATE DATABASE to check for pre-existing subdirectories
that conflict with the OID that we want to use for the new database.
This avoids the risk of trying to remove files that maybe we shouldn't
remove.  Per gripe from Jon Lapham and subsequent discussion of 27-Sep.
2006-10-18 22:44:12 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian a22d76d96a Allow include files to compile own their own.
Strip unused include files out unused include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.
2006-07-13 16:49:20 +00:00
Alvaro Herrera d4cef0aa2a Improve vacuum code to track minimum Xids per table instead of per database.
To this end, add a couple of columns to pg_class, relminxid and relvacuumxid,
based on which we calculate the pg_database columns after each vacuum.

We now force all databases to be vacuumed, even template ones.  A backend
noticing too old a database (meaning pg_database.datminxid is in danger of
falling behind Xid wraparound) will signal the postmaster, which in turn will
start an autovacuum iteration to process the offending database.  In principle
this is only there to cope with frozen (non-connectable) databases without
forcing users to set them to connectable, but it could force regular user
database to go through a database-wide vacuum at any time.  Maybe we should
warn users about this somehow.  Of course the real solution will be to use
autovacuum all the time ;-)

There are some additional improvements we could have in this area: for example
the vacuum code could be smarter about not updating pg_database for each table
when called by autovacuum, and do it only once the whole autovacuum iteration
is done.

I updated the system catalogs documentation, but I didn't modify the
maintenance section.  Also having some regression tests for this would be nice
but it's not really a very straightforward thing to do.

Catalog version bumped due to system catalog changes.
2006-07-10 16:20:52 +00:00
Tom Lane 52667d56a3 Rethink the locking mechanisms used for CREATE/DROP/RENAME DATABASE.
The former approach used ExclusiveLock on pg_database, which being a
cluster-wide lock meant only one of these operations could proceed at
a time; worse, it also blocked all incoming connections in ReverifyMyDatabase.
Now that we have LockSharedObject(), we can use locks of different types
applied to databases considered as objects.  This allows much more
flexible management of the interlocking: two CREATE DATABASEs need not
block each other, and need not block connections except to the template
database being used.  Similarly DROP DATABASE doesn't block unrelated
operations.  The locking used in flatfiles.c is also much narrower in
scope than before.  Per recent proposal.
2006-05-04 16:07:29 +00:00
Tom Lane cb98e6fb8f Create a syscache for pg_database-indexed-by-oid, and make use of it
in various places that were previously doing ad hoc pg_database searches.
This may speed up database-related privilege checks a little bit, but
the main motivation is to eliminate the performance reason for having
ReverifyMyDatabase do such a lot of stuff (viz, avoiding repeat scans
of pg_database during backend startup).  The locking reason for having
that routine is about to go away, and it'd be good to have the option
to break it up.
2006-05-03 22:45:26 +00:00
Tom Lane 6d61cdec07 Clean up and document the API for XLogOpenRelation and XLogReadBuffer.
This commit doesn't make much functional change, but it does eliminate some
duplicated code --- for instance, PageIsNew tests are now done inside
XLogReadBuffer rather than by each caller.
The GIST xlog code still needs a lot of love, but I'll worry about that
separately.
2006-03-29 21:17:39 +00:00
Tom Lane 0a20207060 Arrange to emit a description of the current XLOG record as error context
when an error occurs during xlog replay.  Also, replace the former risky
'write into a fixed-size buffer with no overflow detection' API for XLOG
record description routines; use an expansible StringInfo instead.  (The
latter accounts for most of the patch bulk.)

Qingqing Zhou
2006-03-24 04:32:13 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Bruce Momjian f9a726aa88 I've created a new shared catalog table pg_shdescription to store
comments on cluster global objects like databases, tablespaces, and
roles.

It touches a lot of places, but not much in the way of big changes.  The
only design decision I made was to duplicate the query and manipulation
functions rather than to try and have them handle both shared and local
comments.  I believe this is simpler for the code and not an issue for
callers because they know what type of object they are dealing with.
This has resulted in a shobj_description function analagous to
obj_description and backend functions [Create/Delete]SharedComments
mirroring the existing [Create/Delete]Comments functions.

pg_shdescription.h goes into src/include/catalog/

Kris Jurka
2006-02-12 03:22:21 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Andrew Dunstan 5b352d8e12 DROP DATABASE IF EXISTS variant 2005-11-22 15:24:18 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 375e7d5579 Use a safer order of operations in dropdb(): rollbackable operations,
ie removing shared-dependency entries, should happen before non-rollbackable
ones.  That way a failure during the rollbackable part doesn't leave us
with inconsistent state.
2005-10-10 20:02:20 +00:00
Tom Lane bf1e33d24a Fix unwanted denial of ALTER OWNER rights to superusers. There was some
discussion of getting around this by relaxing the checks made for regular
users, but I'm disinclined to toy with the security model right now,
so just special-case it for superusers where needed.
2005-08-22 17:38:20 +00:00
Tom Lane 721e53785d Solve the problem of OID collisions by probing for duplicate OIDs
whenever we generate a new OID.  This prevents occasional duplicate-OID
errors that can otherwise occur once the OID counter has wrapped around.
Duplicate relfilenode values are also checked for when creating new
physical files.  Per my recent proposal.
2005-08-12 01:36:05 +00:00
Tom Lane 558730ac6b Clean up CREATE DATABASE processing to make it more robust and get rid
of special case for Windows port.  Put a PG_TRY around most of createdb()
to ensure that we remove copied subdirectories on failure, even if the
failure happens while creating the pg_database row.  (I think this explains
Oliver Siegmar's recent report.)  Having done that, there's no need for
the fragile assumption that copydir() mustn't ereport(ERROR), so simplify
its API.  Eliminate the old code that used system("cp ...") to copy
subdirectories, in favor of using copydir() on all platforms.  This not
only should allow much better error reporting, but allows us to fsync
the created files before trusting that the copy has succeeded.
2005-08-02 19:02:32 +00:00
Tom Lane d42cf5a42a Add per-user and per-database connection limit options.
This patch also includes preliminary update of pg_dumpall for roles.
Petr Jelinek, with review by Bruce Momjian and Tom Lane.
2005-07-31 17:19:22 +00:00
Tom Lane aa1110624c Adjust permissions checking for ALTER OWNER commands: instead of
requiring superuserness always, allow an owner to reassign ownership
to any role he is a member of, if that role would have the right to
create a similar object.  These three requirements essentially state
that the would-be alterer has enough privilege to DROP the existing
object and then re-CREATE it as the new role; so we might as well
let him do it in one step.  The ALTER TABLESPACE case is a bit
squirrely, but the whole concept of non-superuser tablespace owners
is pretty dubious anyway.  Stephen Frost, code review by Tom Lane.
2005-07-14 21:46:30 +00:00
Neil Conway 40ffa1a14c Remove some dead code for handling XLOG_DBASE_CREATE_OLD and
XLOG_DBASE_DROP_OLD WAL records -- these records are no longer created in
current sources. Adjust numbering of XLOG_DBASE_CREATE and XLOG_DBASE_DROP
and bump the catversion. Patch from Gavin Sherry, adjusted by Neil Conway.
2005-07-08 04:12:27 +00:00
Tom Lane 59d1b3d99e Track dependencies on shared objects (which is to say, roles; we already
have adequate mechanisms for tracking the contents of databases and
tablespaces).  This solves the longstanding problem that you can drop a
user who still owns objects and/or has access permissions.
Alvaro Herrera, with some kibitzing from Tom Lane.
2005-07-07 20:40:02 +00:00
Tom Lane 401de9c8be Improve the checkpoint signaling mechanism so that the bgwriter can tell
the difference between checkpoints forced due to WAL segment consumption
and checkpoints forced for other reasons (such as CREATE DATABASE).  Avoid
generating 'checkpoints are occurring too frequently' messages when the
checkpoint wasn't caused by WAL segment consumption.  Per gripe from
Chris K-L.
2005-06-30 00:00:52 +00:00
Tom Lane c33d575899 More cleanup on roles patch. Allow admin option to be inherited through
role memberships; make superuser/createrole distinction do something
useful; fix some locking and CommandCounterIncrement issues; prevent
creation of loops in the membership graph.
2005-06-29 20:34:15 +00:00
Tom Lane 7762619e95 Replace pg_shadow and pg_group by new role-capable catalogs pg_authid
and pg_auth_members.  There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance).  But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies.  The catalog changes should
be pretty much done.
2005-06-28 05:09:14 +00:00
Tom Lane fbcbc5d06f Force a checkpoint before committing a CREATE DATABASE command. This
should fix the recent reports of "index is not a btree" failures,
as well as preventing a more obscure race condition involving changes
to a template database just after copying it with CREATE DATABASE.
2005-06-25 22:47:29 +00:00
Tom Lane 6f7fc0bade Cause initdb to create a third standard database "postgres", which
unlike template0 and template1 does not have any special status in
terms of backend functionality.  However, all external utilities such
as createuser and createdb now connect to "postgres" instead of
template1, and the documentation is changed to encourage people to use
"postgres" instead of template1 as a play area.  This should fix some
longstanding gotchas involving unexpected propagation of database
objects by createdb (when you used template1 without understanding
the implications), as well as ameliorating the problem that CREATE
DATABASE is unhappy if anyone else is connected to template1.
Patch by Dave Page, minor editing by Tom Lane.  All per recent
pghackers discussions.
2005-06-21 04:02:34 +00:00
Tom Lane ee7ac7b11e Modify XLogInsert API to make callers specify whether pages to be backed
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero.  That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space.  Per suggestion by Heikki Linnakangas.
2005-06-06 20:22:58 +00:00
Tom Lane 4c8495a1f2 Remove the mostly-stubbed-out-anyway support routines for WAL UNDO.
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
2005-06-06 17:01:25 +00:00
Tom Lane ee3b71f6bc Split the shared-memory array of PGPROC pointers out of the sinval
communication structure, and make it its own module with its own lock.
This should reduce contention at least a little, and it definitely makes
the code seem cleaner.  Per my recent proposal.
2005-05-19 21:35:48 +00:00
Tom Lane 162bd08b3f Completion of project to use fixed OIDs for all system catalogs and
indexes.  Replace all heap_openr and index_openr calls by heap_open
and index_open.  Remove runtime lookups of catalog OID numbers in
various places.  Remove relcache's support for looking up system
catalogs by name.  Bulky but mostly very boring patch ...
2005-04-14 20:03:27 +00:00
Tom Lane cad86e253b WAL must log CREATE and DROP DATABASE operations *without* using any
explicit paths, so that the log can be replayed in a data directory
with a different absolute path than the original had.  To avoid forcing
initdb in the 8.0 branch, continue to accept the old WAL log record
types; they will never again be generated however, and the code can be
dropped after the next forced initdb.  Per report from Oleg Bartunov.
We still need to think about what it really means to WAL-log CREATE
TABLESPACE commands: we more or less have to put the absolute path
into those, but how to replay in a different context??
2005-03-23 00:03:37 +00:00
Tom Lane 78a572bf0c When cloning template0 (or other fully-frozen databases), set the new
database's datallowconn and datfrozenxid to the current transaction ID
instead of copying the source database's values.  This is OK because we
assume the source DB contains no normal transaction IDs whatsoever.
This keeps VACUUM from immediately starting to complain about unvacuumed
databases in the situation where we are more than 2 billion transactions
out from the XID stamp of template0.  Per discussion with Milen Radev
(although his complaint turned out to be due to something else, but the
problem is real anyway).
2005-03-12 21:33:55 +00:00
Tom Lane c7bbe99452 Fix ALTER DATABASE RENAME to allow the operation if user is a superuser
who for some reason isn't marked usecreatedb.  Per report from Alexander
Pravking.  Also fix sloppy coding in have_createdb_privilege().
2005-03-12 21:11:50 +00:00
Tom Lane 5d5087363d Replace the BufMgrLock with separate locks on the lookup hashtable and
the freelist, plus per-buffer spinlocks that protect access to individual
shared buffer headers.  This requires abandoning a global freelist (since
the freelist is a global contention point), which shoots down ARC and 2Q
as well as plain LRU management.  Adopt a clock sweep algorithm instead.
Preliminary results show substantial improvement in multi-backend situations.
2005-03-04 20:21:07 +00:00
Tom Lane 0fc4ecf935 Finish up the flat-files project: get rid of GetRawDatabaseInfo() hack
in favor of looking at the flat file copy of pg_database during backend
startup.  This should finally eliminate the various corner cases in which
backend startup fails unexpectedly because it isn't able to distinguish
live and dead tuples in pg_database.  Simplify locking on pg_database
to be similar to the rules used with pg_shadow and pg_group, and eliminate
FlushRelationBuffers operations that were used only to reduce the odds
of failure of GetRawDatabaseInfo.
initdb forced due to addition of a trigger to pg_database.
2005-02-26 18:43:34 +00:00
Tom Lane 60b2444cc3 Add code to prevent transaction ID wraparound by enforcing a safe limit
in GetNewTransactionId().  Since the limit value has to be computed
before we run any real transactions, this requires adding code to database
startup to scan pg_database and determine the oldest datfrozenxid.
This can conveniently be combined with the first stage of an attack on
the problem that the 'flat file' copies of pg_shadow and pg_group are
not properly updated during WAL recovery.  The code I've added to
startup resides in a new file src/backend/utils/init/flatfiles.c, and
it is responsible for rewriting the flat files as well as initializing
the XID wraparound limit value.  This will eventually allow us to get
rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add
a trigger to pg_database.
2005-02-20 02:22:07 +00:00
Neil Conway a885ecd6ef Change heap_modifytuple() to require a TupleDesc rather than a
Relation. Patch from Alvaro Herrera, minor editorializing by
Neil Conway.
2005-01-27 23:24:11 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane b2a2f4cef7 Force pg_database updates out to disk immediately after ALTER DATABASE;
this is to avoid scenarios where incoming backends find no live copies
of a database's row because the only live copy is in an as-yet-unwritten
shared buffer, which they can't see.  Also, use FlushRelationBuffers()
for forcing out pg_database, instead of the much more expensive BufferSync().
There's no need to write out pages belonging to other relations.
2004-11-18 01:14:26 +00:00
Tom Lane e6f9bf9b7f On Windows, force a checkpoint just before dropping a database's physical
files and directories.  This ensures that the bgwriter will close any open
file references it is holding for files therein, which is needed for the
rmdir() to succeed.  Andrew Dunstan and Tom Lane.
2004-10-28 00:39:59 +00:00
Tom Lane 830c168e5c Give a more user-friendly error message in situation where CREATE DATABASE
specifies a new default tablespace and the template database already has
some tables in that tablespace.  There isn't any way to solve this fully
without modifying the clone database's pg_class contents, so for now the
best we can do is issue a better error message.
2004-10-17 20:47:21 +00:00
Tom Lane 09c6ac9513 Dept. of second thoughts: it'd be a good idea to flush buffers
during replay of CREATE DATABASE as well as the first time around.
Else it's possible that the copy operation will copy obsolete blocks.
We are still a long way from guaranteeing anything about using a
recently-written database as a CREATE template, but this seems needed
to ensure the existing behavior holds up during replay.
2004-08-30 03:50:24 +00:00
Bruce Momjian 15d3f9f6b7 Another pgindent run with lib typedefs added. 2004-08-30 02:54:42 +00:00
Tom Lane 50742aed68 Add WAL logging for CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.
Fix TablespaceCreateDbspace() to be able to create a dummy directory
in place of a dropped tablespace's symlink.  This eliminates the open
problem of a PANIC during WAL replay when a replayed action attempts
to touch a file in a since-deleted tablespace.  It also makes for a
significant improvement in the usability of PITR replay.
2004-08-29 21:08:48 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane d6f8a76cf2 Cause ALTER OWNER commands to update the object's ACL, replacing references
to the old owner with the new owner.  This is not necessarily right, but
it's sure a lot more likely to be what the user wants than doing nothing.
Christopher Kings-Lynne, some rework by Tom Lane.
2004-08-01 20:30:49 +00:00
Bruce Momjian ca9540d34f Add docs for initdb --auth. 2004-08-01 06:19:26 +00:00
Tom Lane 0adfa2c39d Support renaming of tablespaces, and changing the owners of
aggregates, conversions, functions, operators, operator classes,
schemas, types, and tablespaces.  Fold the existing implementations
of alter domain owner and alter database owner in with these.

Christopher Kings-Lynne
2004-06-25 21:55:59 +00:00
Tom Lane 2467394ee1 Tablespaces. Alternate database locations are dead, long live tablespaces.
There are various things left to do: contrib dbsize and oid2name modules
need work, and so does the documentation.  Also someone should think about
COMMENT ON TABLESPACE and maybe RENAME TABLESPACE.  Also initlocation is
dead, it just doesn't know it yet.

Gavin Sherry and Tom Lane.
2004-06-18 06:14:31 +00:00
Bruce Momjian 6cc4175b25 Attached is a patch that takes care of the PATHSEP issue. I made a more
extensive change then what was suggested. I found the file path.c that
contained a lot of "Unix/Windows" agnostic functions so I added a function
there instead and removed the PATHSEP declaration in exec.c altogether. All
to keep things from scattering all over the code.

I also took the liberty of changing the name of the functions
"first_path_sep" and "last_path_sep". Where I come from (and I'm apparently
not alone given the former macro name PATHSEP), they should be called
"first_dir_sep" and "last_dir_sep". The new function I introduced, that
actually finds path separators, is now the "first_path_sep". The patch
contains changes on all affected places of course.

I also changed the documentation on dynamic_library_path to reflect the
chagnes.

Thomas Hallgren
2004-06-10 22:26:24 +00:00
Bruce Momjian cfbfdc557d This patch implement the TODO [ALTER DATABASE foo OWNER TO bar].
It was necessary to touch in grammar and create a new node to make home
to the new syntax. The command is also supported in E
CPG. Doc updates are attached too. Only superusers can change the owner
of the database. New owners don't need any aditional
privileges.

Euler Taveira de Oliveira
2004-05-26 13:57:04 +00:00
Neil Conway d0b4399d81 Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.

The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
2004-05-26 04:41:50 +00:00
Bruce Momjian 31338352bd * Most changes are to fix warnings issued when compiling win32
* removed a few redundant defines
* get_user_name safe under win32
* rationalized pipe read EOF for win32 (UPDATED PATCH USED)
* changed all backend instances of sleep() to pg_usleep

    - except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a
32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is
acceptable, please replace with pg_usleep(2000000000L)]

I added a comment to that part of the code:

    /*
     *  It would be nice to use pg_usleep() here, but only does 2000 sec
     *  or 33 minutes, which seems too short.
     */
    sleep(1000000);

Claudio Natoli
2004-04-19 17:42:59 +00:00
Tom Lane 87bd956385 Restructure smgr API as per recent proposal. smgr no longer depends on
the relcache, and so the notion of 'blind write' is gone.  This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
2004-02-10 01:55:27 +00:00
Neil Conway 192ad63bd7 More janitorial work: remove the explicit casting of NULL literals to a
pointer type when it is not necessary to do so.

For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
2004-01-07 18:56:30 +00:00
Bruce Momjian 19055b78ef Add mention with might need to use cp -R someday for portability. 2003-12-15 22:56:44 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Jan Wieck cfeca62148 Background writer process
This first part of the background writer does no syncing at all.
It's only purpose is to keep the LRU heads clean so that regular
backends seldom to never have to call write().

Jan
2003-11-19 15:55:08 +00:00
Tom Lane fa5c8a055a Cross-data-type comparisons are now indexable by btrees, pursuant to my
pghackers proposal of 8-Nov.  All the existing cross-type comparison
operators (int2/int4/int8 and float4/float8) have appropriate support.
The original proposal of storing the right-hand-side datatype as part of
the primary key for pg_amop and pg_amproc got modified a bit in the event;
it is easier to store zero as the 'default' case and only store a nonzero
when the operator is actually cross-type.  Along the way, remove the
long-since-defunct bigbox_ops operator class.
2003-11-12 21:15:59 +00:00
Tom Lane c1d62bfd00 Add operator strategy and comparison-value datatype fields to ScanKey.
Remove the 'strategy map' code, which was a large amount of mechanism
that no longer had any use except reverse-mapping from procedure OID to
strategy number.  Passing the strategy number to the index AM in the
first place is simpler and faster.
This is a preliminary step in planned support for cross-datatype index
operations.  I'm committing it now since the ScanKeyEntryInitialize()
API change touches quite a lot of files, and I want to commit those
changes before the tree drifts under me.
2003-11-09 21:30:38 +00:00
Peter Eisentraut 7438af96fa More message editing, some suggested by Alvaro Herrera 2003-09-29 00:05:25 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane 9cb4a28f47 Improve error message for cp or rm failur during create/drop database,
per recent discussions.
2003-09-10 20:24:09 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane c4cf7fb814 Adjust 'permission denied' messages to be more useful and consistent. 2003-08-01 00:15:26 +00:00