Commit Graph

3246 Commits

Author SHA1 Message Date
Simon Riggs 722bf7017b Extend ALTER TABLE to allow Foreign Keys to be added without initial validation.
FK constraints that are marked NOT VALID may later be VALIDATED, which uses an
ShareUpdateExclusiveLock on constraint table and RowShareLock on referenced
table. Significantly reduces lock strength and duration when adding FKs.
New state visible from psql.

Simon Riggs, with reviews from Marko Tiikkaja and Robert Haas
2011-02-08 12:23:20 +00:00
Heikki Linnakangas dafaa3efb7 Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.

To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.

A predicate lock doesn't conflict with any regular locks or with another
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead for
for other transactions.

Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.

We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.

Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.

Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-08 00:09:08 +02:00
Magnus Hagander 76129e7f14 Include more status information in walsender results
Add the current xlog insert location to the response of
IDENTIFY_SYSTEM, and adds result sets containing start
and stop location of backups to BASE_BACKUP responses.
2011-02-03 13:46:23 +01:00
Heikki Linnakangas 32866837f0 Fix typo 2011-01-31 18:29:38 +02:00
Magnus Hagander 507069de6d Add option to include WAL in base backup
When included, this makes the base backup a complete working
"clone" of the initial database, ready to have a postmaster
started against it without the need to set up any log archiving
or similar.

Magnus Hagander, reviewed by Fujii Masao and Heikki Linnakangas
2011-01-30 21:30:09 +01:00
Heikki Linnakangas 1e4baa5c96 Update psql's \copyright to match the text we have in the COPYRIGHT file. 2011-01-27 20:20:49 +02:00
Peter Eisentraut 77ff840835 Document the "S" option for psql's \dn command in the psql help
This option was recently introduced, but the documentation in help.c
was not updated.
2011-01-25 01:51:35 +02:00
Heikki Linnakangas 74be35b07c Fix typo in the psql \d query handling, so that we use the correct query
against 9.0 servers.
2011-01-24 14:34:15 +02:00
Heikki Linnakangas 56d77c9e56 Silence compiler warning about uninitialized variable, noted by
Itagaki Takahiro
2011-01-24 08:28:35 +02:00
Magnus Hagander 39e911e28a Reorder includes to unbreak MSVC 2011-01-23 22:44:07 +01:00
Heikki Linnakangas 7f508f1c6b Add 'directory' format to pg_dump. The new directory format is compatible
with the 'tar' format, in that untarring a tar format archive produces a
valid directory format archive.

Joachim Wieland and Heikki Linnakangas
2011-01-23 23:10:15 +02:00
Tom Lane f36920796e Fix another portability issue in pg_basebackup.
The target of sscanf with a %o format had better be of integer width,
but "mode_t" conceivably isn't that.  Another compiler warning seen
only on some platforms; this one I think is potentially a real bug
and not just a warning.
2011-01-23 14:26:51 -05:00
Tom Lane 10e99f15d4 Add .gitignore file to silence complaints about pg_basebackup. 2011-01-23 13:07:34 -05:00
Tom Lane b3cfcdaad2 Suppress uninitialized-variable warning. 2011-01-23 13:06:38 -05:00
Magnus Hagander d13e0975c9 Use pg_strcasecmp instead of strcasecmp for portability
Per buildfarm.
2011-01-23 17:35:02 +01:00
Magnus Hagander fe12263c9f filemode is parsed on win32 even if never used
Per buildfarm failure.
2011-01-23 14:45:23 +01:00
Magnus Hagander 048d148fe6 Add pg_basebackup tool for streaming base backups
This tool makes it possible to do the pg_start_backup/
copy files/pg_stop_backup step in a single command.

There are still some steps to be done before this is a
complete backup solution, such as the ability to stream
the required WAL logs, but it's still usable, and
could do with some buildfarm coverage.

In passing, make the checkpoint request optionally
fast instead of hardcoding it.

Magnus Hagander, reviewed by Fujii Masao and Dimitri Fontaine
2011-01-23 12:21:23 +01:00
Tom Lane e2627258c3 Suppress possibly-uninitialized-variable warnings from gcc 4.5.
It appears that gcc 4.5 can issue such warnings for whole structs, not
just scalar variables as in the past.  Refactor some pg_dump code slightly
so that the OutputContext local variables are always initialized, even
if they won't be used.  It's cheap enough to not be worth worrying about.
2011-01-22 17:56:42 -05:00
Robert Haas 9c5e2c120b Add new psql command \dL to list languages.
Original patch by Fernando Ike, revived by Josh Kuperschmidt, reviewed by Andreas
Karlsson, and in earlier versions by Tom Lane and Peter Eisentraut.
2011-01-20 00:00:30 -05:00
Tom Lane 52948169bc Code review for postmaster.pid contents changes.
Fix broken test for pre-existing postmaster, caused by wrong code for
appending lines to the lockfile; don't write a failed listen_address
setting into the lockfile; don't arbitrarily change the location of the
data directory in the lockfile compared to previous releases; provide more
consistent and useful definitions of the socket path and listen_address
entries; avoid assuming that pg_ctl has the same DEFAULT_PGSOCKET_DIR as
the postmaster; assorted code style improvements.
2011-01-13 19:01:28 -05:00
Bruce Momjian d8d3d2a4f3 Fix pg_upgrade of large object permissions by preserving pg_auth.oid,
which is stored in pg_largeobject_metadata.

No backpatch to 9.0 because you can't migrate from 9.0 to 9.0 with the
same catversion (because of tablespace conflict), and a pre-9.0
migration to 9.0 has not large object permissions to migrate.
2011-01-07 21:59:29 -05:00
Bruce Momjian 2896c87ce4 Force pg_upgrade's to preserve pg_class.oid, not pg_class.relfilenode.
Toast tables have identical pg_class.oid and pg_class.relfilenode, but
for clarity it is good to preserve the pg_class.oid.

Update comments regarding what is preserved, and do some
variable/function renaming for clarity.
2011-01-07 21:26:13 -05:00
Bruce Momjian 5cff5b5779 Clarify pg_upgrade's creation of the map file structure. Also clean
up pg_dump's calling of pg_upgrade_support functions.
2011-01-05 11:37:08 -05:00
Itagaki Takahiro 14158f25cd Improve psql tab completion for CREATE/ALTER ROLE [NO]REPLICATION.
Missing support for VALID UNTIL in CREATE ROLE is also added.
2011-01-04 17:56:01 +09:00
Robert Haas 0d692a0dc9 Basic foreign table support.
Foreign tables are a core component of SQL/MED.  This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried.  Support for foreign table scans will need to
be added in a future patch.  However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.

Shigeru Hanada, heavily revised by Robert Haas
2011-01-01 23:48:11 -05:00
Robert Haas d7acf6cc4a Fix pg_dump support for security labels on columns.
Along the way, correct an erroneous comment.
2011-01-01 17:44:28 -05:00
Bruce Momjian 92a73d2190 Add #include <time.h> to pg_ctl.c to fix compiler warning. 2011-01-01 15:55:36 -05:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Bruce Momjian 30aeda4394 Include the first valid listen address in pg_ctl to improve server start
"wait" detection and add postmaster start time to help determine if the
postmaster is actually using the specified data directory.
2010-12-31 17:25:02 -05:00
Robert Haas 53dbc27c62 Support unlogged tables.
The contents of an unlogged table are WAL-logged; thus, they are not
available on standby servers and are truncated whenever the database
system enters recovery.  Indexes on unlogged tables are also unlogged.
Unlogged GiST indexes are not currently supported.
2010-12-29 06:48:53 -05:00
Magnus Hagander 9b8aff8c19 Add REPLICATION privilege for ROLEs
This privilege is required to do Streaming Replication, instead of
superuser, making it possible to set up a SR slave that doesn't
have write permissions on the master.

Superuser privileges do NOT override this check, so in order to
use the default superuser account for replication it must be
explicitly granted the REPLICATION permissions. This is backwards
incompatible change, in the interest of higher default security.
2010-12-29 11:05:03 +01:00
Andrew Dunstan 04ee0db6b2 Allow vpath builds and regression tests to succeed on Mingw. Backpatch to release 8.4 - earlier releases would require more changes and it's not worth the trouble. 2010-12-24 13:31:28 -05:00
Bruce Momjian 075354ad1b Improve "pg_ctl -w start" server detection by writing the postmaster
port and socket directory into postmaster.pid, and have pg_ctl read from
that file, for use by PQping().
2010-12-24 09:45:52 -05:00
Robert Haas 9878e295dc Improved tab completion for views with triggers.
Allow INSERT INTO, UPDATE, and DELETE FROM to be completed with
either the name of a table (as before) or the name of a view with
an appropriate INSTEAD OF rule.

Along the way, allow CREATE TRIGGER to be completed with INSTEAD OF,
as well as BEFORE and AFTER.

David Fetter, reviewed by Itagaki Takahiro
2010-12-13 22:46:55 -05:00
Tom Lane 671199929d Move a couple of initdb's subroutines into src/port/.
mkdir_p and check_data_dir will be useful in CREATE TABLESPACE, since we
have agreed that that command should handle subdirectory creation just like
initdb creates the PGDATA directory.  Push them into src/port/ so that they
are available to both initdb and the backend.  Rename to pg_mkdir_p and
pg_check_dir, just to be on the safe side.  Add FreeBSD's copyright notice
to pgmkdirp.c, since that's where the code came from originally (this
really should have been in initdb.c).  Very marginal code/comment cleanup.
2010-12-10 19:42:44 -05:00
Tom Lane 04f4e10cfc Use symbolic names not octal constants for file permission flags.
Purely cosmetic patch to make our coding standards more consistent ---
we were doing symbolic some places and octal other places.  This patch
fixes all C-coded uses of mkdir, chmod, and umask.  There might be some
other calls I missed.  Inconsistency noted while researching tablespace
directory permissions issue.
2010-12-10 17:35:33 -05:00
Tom Lane 663fc32e26 Eliminate O(N^2) behavior in parallel restore with many blobs.
With hundreds of thousands of TOC entries, the repeated searches in
reduce_dependencies() become the dominant cost.  Get rid of that searching
by constructing reverse-dependency lists, which we can do in O(N) time
during the fix_dependencies() preprocessing.  I chose to store the reverse
dependencies as DumpId arrays for consistency with the forward-dependency
representation, and keep the previously-transient tocsByDumpId[] array
around to locate actual TOC entry structs quickly from dump IDs.

While this fixes the slow case reported by Vlad Arkhipov, there is still
a potential for O(N^2) behavior with sufficiently many tables:
fix_dependencies itself, as well as mark_create_done and
inhibit_data_for_failed_table, are doing repeated searches to deal with
table-to-table-data dependencies.  Possibly this work could be extended
to deal with that, although the latter two functions are also used in
non-parallel restore where we currently don't run fix_dependencies.

Another TODO is that we fail to parallelize restore of multiple blobs
at all.  This appears to require changes in the archive format to fix.

Back-patch to 9.0 where the problem was reported.  8.4 has potential issues
as well; but since it doesn't create a separate TOC entry for each blob,
it's at much less risk of having enough TOC entries to cause real problems.
2010-12-09 13:03:11 -05:00
Heikki Linnakangas 9cea52a5a3 Remove misleading comments. Move _Clone and _DeClone functions before
the "END OF FORMAT CALLBACKS" comment, because they are format callbacks too.
2010-12-03 14:58:24 +02:00
Alvaro Herrera d7e5d151da Move private struct declaration to compress_io.c
Keep only the typedef in the header file.
2010-12-02 17:45:13 -03:00
Alvaro Herrera 0025b76f4f Remove trailing whitespace 2010-12-02 17:45:13 -03:00
Alvaro Herrera d67a39c326 Remove useless struct declaration 2010-12-02 17:45:12 -03:00
Alvaro Herrera 7f4a7af2fd Silence compiler 2010-12-02 17:45:12 -03:00
Heikki Linnakangas bf9aa490db Refactor the pg_dump zlib code from pg_backup_custom.c to a separate file,
to make it easier to reuse that code. There is no user-visible changes.

This is in preparation for the patch to add a new archive format, a directory,
to perform a custom-like dump but with each table being dumped to a separate
file (that in turn is a prerequisite for parallel pg_dump). This also makes it
easier to add new compression methods in the future, and makes the
pg_backup_custom.c code easier to read, when the compression-related code is
factored out.

Joachim Wieland, with heavy editorialization by me.
2010-12-02 21:39:03 +02:00
Tom Lane db96e1ccfc Rewrite PQping to be more like what we agreed to last week.
Basically, we want to distinguish all cases where the connection was
not made from those where it was.  A convenient proxy for this is to
see if we got a message with a SQLSTATE code back from the postmaster.
This presumes that the postmaster will always send us a SQLSTATE in
a failure message, which is true for 7.4 and later postmasters in
every case except fork failure.  (We could possibly complicate the
postmaster code to do something about that, but it seems not worth
the trouble, especially since pg_ctl's response for that case should
be to keep waiting anyway.)

If we did get a SQLSTATE from the postmaster, there are basically only
two cases, as per last week's discussion: ERRCODE_CANNOT_CONNECT_NOW
and everything else.  Any other error code implies that the postmaster
is in principle willing to accept connections, it just didn't like or
couldn't handle this particular request.  We want to make a special
case for ERRCODE_CANNOT_CONNECT_NOW so that "pg_ctl start -w" knows
it should keep waiting.

In passing, pick names for the enum constants that are a tad less
likely to present collision hazards in future.
2010-11-27 01:30:34 -05:00
Robert Haas 55109313f9 Add more ALTER <object> .. SET SCHEMA commands.
This adds support for changing the schema of a conversion, operator,
operator class, operator family, text search configuration, text search
dictionary, text search parser, or text search template.

Dimitri Fontaine, with assorted corrections and other kibitzing.
2010-11-26 17:31:54 -05:00
Bruce Momjian 5f4b3d750b Improve pg_ctl "cannot connect" spacing, per Tom, and wording. 2010-11-26 10:04:18 -05:00
Bruce Momjian 4646e0cef7 Improve pg_ctl "cannot connect" warning, per suggestion from Magnus. 2010-11-25 14:38:20 -05:00
Bruce Momjian afd7d9adca Add PQping and PQpingParams to libpq to allow detection of the server's
status, including a status where the server is running but refuses a
postgres connection.

Have pg_ctl use this new function.  This fixes the case where pg_ctl
reports that the server is not running (cannot connect) but in fact it
is running.
2010-11-25 13:09:38 -05:00
Tom Lane 725d52d0c2 Create the system catalog infrastructure needed for KNNGIST.
This commit adds columns amoppurpose and amopsortfamily to pg_amop, and
column amcanorderbyop to pg_am.  For the moment all the entries in
amcanorderbyop are "false", since the underlying support isn't there yet.

Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with
[ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new
columns of pg_amop to be populated, and create pg_dump support for dumping
that information.

I also added some documentation, although it's perhaps a bit premature
given that the feature doesn't do anything useful yet.

Teodor Sigaev, Robert Haas, Tom Lane
2010-11-24 14:22:17 -05:00
Peter Eisentraut fc946c39ae Remove useless whitespace at end of lines 2010-11-23 22:34:55 +02:00