Commit Graph

2431 Commits

Author SHA1 Message Date
Tom Lane 08dd23cec7 Fix some issues with temp/transient tables in extension scripts.
Phil Sorber reported that a rewriting ALTER TABLE within an extension
update script failed, because it creates and then drops a placeholder
table; the drop was being disallowed because the table was marked as an
extension member.  We could hack that specific case but it seems likely
that there might be related cases now or in the future, so the most
practical solution seems to be to create an exception to the general rule
that extension member objects can only be dropped by dropping the owning
extension.  To wit: if the DROP is issued within the extension's own
creation or update scripts, we'll allow it, implicitly performing an
"ALTER EXTENSION DROP object" first.  This will simplify cases such as
extension downgrade scripts anyway.

No docs change since we don't seem to have documented the idea that you
would need ALTER EXTENSION DROP for such an action to begin with.

Also, arrange for explicitly temporary tables to not get linked as
extension members in the first place, and the same for the magic
pg_temp_nnn schemas that are created to hold them.  This prevents assorted
unpleasant results if an extension script creates a temp table: the forced
drop at session end would either fail or remove the entire extension, and
neither of those outcomes is desirable.  Note that this doesn't fix the
ALTER TABLE scenario, since the placeholder table is not temp (unless the
table being rewritten is).

Back-patch to 9.1.
2012-03-08 15:53:09 -05:00
Tom Lane 0e5e167aae Collect and use element-frequency statistics for arrays.
This patch improves selectivity estimation for the array <@, &&, and @>
(containment and overlaps) operators.  It enables collection of statistics
about individual array element values by ANALYZE, and introduces
operator-specific estimators that use these stats.  In addition,
ScalarArrayOpExpr constructs of the forms "const = ANY/ALL (array_column)"
and "const <> ANY/ALL (array_column)" are estimated by treating them as
variants of the containment operators.

Since we still collect scalar-style stats about the array values as a
whole, the pg_stats view is expanded to show both these stats and the
array-style stats in separate columns.  This creates an incompatible change
in how stats for tsvector columns are displayed in pg_stats: the stats
about lexemes are now displayed in the array-related columns instead of the
original scalar-related columns.

There are a few loose ends here, notably that it'd be nice to be able to
suppress either the scalar-style stats or the array-element stats for
columns for which they're not useful.  But the patch is in good enough
shape to commit for wider testing.

Alexander Korotkov, reviewed by Noah Misch and Nathan Boley
2012-03-03 20:20:57 -05:00
Alvaro Herrera cb3a7c2b95 ALTER TABLE: skip FK validation when it's safe to do so
We already skip rewriting the table in these cases, but we still force a
whole table scan to validate the data.  This can be skipped, and thus
we can make the whole ALTER TABLE operation just do some catalog touches
instead of scanning the table, when these two conditions hold:

(a) Old and new pg_constraint.conpfeqop match exactly.  This is actually
stronger than needed; we could loosen things by way of operator
families, but it'd require a lot more effort.

(b) The functions, if any, implementing a cast from the foreign type to
the primary opcintype are the same.  For this purpose, we can consider a
binary coercion equivalent to an exact type match.  When the opcintype
is polymorphic, require that the old and new foreign types match
exactly.  (Since ri_triggers.c does use the executor, the stronger check
for polymorphic types is no mere future-proofing.  However, no core type
exercises its necessity.)

Author: Noah Misch

Committer's note: catalog version bumped due to change of the Constraint
node.  I can't actually find any way to have such a node in a stored
rule, but given that we have "out" support for them, better be safe.
2012-02-27 19:10:24 -03:00
Peter Eisentraut 66f0cf7da8 Remove useless const qualifier
Claiming that the typevar argument to DefineCompositeType() is const
was a plain lie.  A similar case in DefineVirtualRelation() was
already changed in passing in commit 1575fbcb.  Also clean up the now
unnecessary casts that used to cast away the const.
2012-02-26 15:22:27 +02:00
Peter Eisentraut 9cfd800aab Add some enumeration commas, for consistency 2012-02-24 11:04:45 +02:00
Tom Lane 891e6e7bfd Require execute permission on the trigger function for CREATE TRIGGER.
This check was overlooked when we added function execute permissions to the
system years ago.  For an ordinary trigger function it's not a big deal,
since trigger functions execute with the permissions of the table owner,
so they couldn't do anything the user issuing the CREATE TRIGGER couldn't
have done anyway.  However, if a trigger function is SECURITY DEFINER,
that is not the case.  The lack of checking would allow another user to
install it on his own table and then invoke it with, essentially, forged
input data; which the trigger function is unlikely to realize, so it might
do something undesirable, for instance insert false entries in an audit log
table.

Reported by Dinesh Kumar, patch by Robert Haas

Security: CVE-2012-0866
2012-02-23 15:38:56 -05:00
Peter Eisentraut c9d7004440 Remove inappropriate quotes
And adjust wording for consistency.
2012-02-23 12:52:17 +02:00
Robert Haas 2254367435 Make EXPLAIN (BUFFERS) track blocks dirtied, as well as those written.
Also expose the new counters through pg_stat_statements.

Patch by me.  Review by Fujii Masao and Greg Smith.
2012-02-22 20:33:05 -05:00
Robert Haas f74f9a277c Fix typo in comment.
Sandro Santilli
2012-02-22 19:46:12 -05:00
Alvaro Herrera a417f85e1d REASSIGN OWNED: Support foreign data wrappers and servers
This was overlooked when implementing those kinds of objects, in commit
cae565e503.

Per report from Pawel Casperek.
2012-02-22 17:33:12 -03:00
Tom Lane 4bfe68dfab Run a portal's cleanup hook immediately when pushing it to FAILED state.
This extends the changes of commit 6252c4f9e2
so that we run the cleanup hook earlier for failure cases as well as
success cases.  As before, the point is to avoid an assertion failure from
an Assert I added in commit a874fe7b4c, which
was meant to check that no user-written code can be called during portal
cleanup.  This fixes a case reported by Pavan Deolasee in which the Assert
could be triggered during backend exit (see the new regression test case),
and also prevents the possibility that the cleanup hook is run after
portions of the portal's state have already been recycled.  That doesn't
really matter in current usage, but it foreseeably could matter in the
future.

Back-patch to 9.1 where the Assert in question was added.
2012-02-15 16:19:01 -05:00
Robert Haas cd30728fb2 Allow LEAKPROOF functions for better performance of security views.
We don't normally allow quals to be pushed down into a view created
with the security_barrier option, but functions without side effects
are an exception: they're OK.  This allows much better performance in
common cases, such as when using an equality operator (that might
even be indexable).

There is an outstanding issue here with the CREATE FUNCTION / ALTER
FUNCTION syntax: there's no way to use ALTER FUNCTION to unset the
leakproof flag.  But I'm committing this as-is so that it doesn't
have to be rebased again; we can fix up the grammar in a future
commit.

KaiGai Kohei, with some wordsmithing by me.
2012-02-13 22:21:14 -05:00
Robert Haas af7914c662 Add TIMING option to EXPLAIN, to allow eliminating of timing overhead.
Sometimes it may be useful to get actual row counts out of EXPLAIN
(ANALYZE) without paying the cost of timing every node entry/exit.
With this patch, you can say EXPLAIN (ANALYZE, TIMING OFF) to get that.

Tomas Vondra, reviewed by Eric Theise, with minor doc changes by me.
2012-02-07 11:23:04 -05:00
Tom Lane 5fc78efcec Avoid throwing ERROR during WAL replay of DROP TABLESPACE.
Although we will not even issue an XLOG_TBLSPC_DROP WAL record unless
removal of the tablespace's directories succeeds, that does not guarantee
that the same operation will succeed during WAL replay.  Foreseeable
reasons for it to fail include temp files created in the tablespace by Hot
Standby backends, wrong directory permissions on a standby server, etc etc.
The original coding threw ERROR if replay failed to remove the directories,
but that is a serious overreaction.  Throwing an error aborts recovery,
and worse means that manual intervention will be needed to get the database
to start again, since otherwise the same error will recur on subsequent
attempts to replay the same WAL record.  And the consequence of failing to
remove the directories is only that some probably-small amount of disk
space is wasted, so it hardly seems justified to throw an error.
Accordingly, arrange to report such failures as LOG messages and keep going
when a failure occurs during replay.

Back-patch to 9.0 where Hot Standby was introduced.  In principle such
problems can occur in earlier releases, but Hot Standby increases the odds
of trouble significantly.  Given the lack of field reports of such issues,
I'm satisfied with patching back as far as the patch applies easily.
2012-02-06 14:44:41 -05:00
Tom Lane 17118825b8 Fix transient clobbering of shared buffers during WAL replay.
RestoreBkpBlocks was in the habit of zeroing and refilling the target
buffer; which was perfectly safe when the code was written, but is unsafe
during Hot Standby operation.  The reason is that we have coding rules
that allow backends to continue accessing a tuple in a heap relation while
holding only a pin on its buffer.  Such a backend could see transiently
zeroed data, if WAL replay had occasion to change other data on the page.
This has been shown to be the cause of bug #6425 from Duncan Rance (who
deserves kudos for developing a sufficiently-reproducible test case) as
well as Bridget Frey's re-report of bug #6200.  It most likely explains the
original report as well, though we don't yet have confirmation of that.

To fix, change the code so that only bytes that are supposed to change will
change, even transiently.  This actually saves cycles in RestoreBkpBlocks,
since it's not writing the same bytes twice.

Also fix seq_redo, which has the same disease, though it has to work a bit
harder to meet the requirement.

So far as I can tell, no other WAL replay routines have this type of bug.
In particular, the index-related replay routines, which would certainly be
broken if they had to meet the same standard, are not at risk because we
do not have coding rules that allow access to an index page when not
holding a buffer lock on it.

Back-patch to 9.0 where Hot Standby was added.
2012-02-05 15:49:17 -05:00
Andrew Dunstan 39909d1d39 Add array_to_json and row_to_json functions.
Also move the escape_json function from explain.c to json.c where it
seems to belong.

Andrew Dunstan, Reviewd by Abhijit Menon-Sen.
2012-02-03 12:11:16 -05:00
Robert Haas 5384a73f98 Built-in JSON data type.
Like the XML data type, we simply store JSON data as text, after checking
that it is valid.  More complex operations such as canonicalization and
comparison may come later, but this is enough for not.

There are a few open issues here, such as whether we should attempt to
detect UTF-8 surrogate pairs represented as \uXXXX\uYYYY, but this gets
the basic framework in place.
2012-01-31 11:48:23 -05:00
Heikki Linnakangas a578257040 Accept a non-existent value in "ALTER USER/DATABASE SET ..." command.
When default_text_search_config, default_tablespace, or temp_tablespaces
setting is set per-user or per-database, with an "ALTER USER/DATABASE SET
..." statement, don't throw an error if the text search configuration or
tablespace does not exist. In case of text search configuration, even if
it doesn't exist in the current database, it might exist in another
database, where the setting is intended to have its effect. This behavior
is now the same as search_path's.

Tablespaces are cluster-wide, so the same argument doesn't hold for
tablespaces, but there's a problem with pg_dumpall: it dumps "ALTER USER
SET ..." statements before the "CREATE TABLESPACE" statements. Arguably
that's pg_dumpall's fault - it should dump the statements in such an order
that the tablespace is created first and then the "ALTER USER SET
default_tablespace ..." statements after that - but it seems better to be
consistent with search_path and default_text_search_config anyway. Besides,
you could still create a dump that throws an error, by creating the
tablespace, running "ALTER USER SET default_tablespace", then dropping the
tablespace and running pg_dumpall on that.

Backpatch to all supported versions.
2012-01-30 11:13:36 +02:00
Peter Eisentraut 2787458362 Disallow ALTER DOMAIN on non-domain type everywhere
This has been the behavior already in most cases, but through
omission, ALTER DOMAIN / OWNER TO and ALTER DOMAIN / SET SCHEMA would
silently work on non-domain types as well.
2012-01-27 21:20:34 +02:00
Robert Haas 2d1371d3ee Be more clear when a new column name collides with a system column name.
We now use the same error message for ALTER TABLE .. ADD COLUMN or
ALTER TABLE .. RENAME COLUMN that we do for CREATE TABLE.  The old
message was accurate, but might be confusing to users not aware of our
system columns.

Vik Reykja, with some changes by me, and further proofreading by Tom Lane
2012-01-26 12:44:30 -05:00
Robert Haas 0e549697d1 Classify DROP operations by whether or not they are user-initiated.
This doesn't do anything useful just yet, but is intended as supporting
infrastructure for allowing sepgsql to sensibly check DROP permissions.

KaiGai Kohei and Robert Haas
2012-01-26 09:30:27 -05:00
Robert Haas 9d35116611 Damage control for yesterday's CheckIndexCompatible changes.
Rip out a regression test that doesn't play well with settings put in
place by the build farm, and rewrite the code in CheckIndexCompatible
in a hopefully more transparent style.
2012-01-26 08:21:31 -05:00
Robert Haas 9f9135d129 Instrument index-only scans to count heap fetches performed.
Patch by me; review by Tom Lane, Jeff Davis, and Peter Geoghegan.
2012-01-25 20:41:52 -05:00
Robert Haas 6eb71ac552 Make CheckIndexCompatible simpler and more bullet-proof.
This gives up the "don't rewrite the index" behavior in a couple of
relatively unimportant cases, such as changing between an array type
and an unconstrained domain over that array type, in return for
making this code more future-proof.

Noah Misch
2012-01-25 15:28:07 -05:00
Alvaro Herrera 74ab96a45e Add pg_trigger_depth() function
This reports the depth level of triggers currently in execution, or zero
if not called from inside a trigger.

No catversion bump in this patch, but you have to initdb if you want
access to the new function.

Author: Kevin Grittner
2012-01-25 13:22:54 -03:00
Simon Riggs b8a91d9d1c ALTER <thing> [IF EXISTS] ... allows silent DDL if required,
e.g. ALTER FOREIGN TABLE IF EXISTS foo RENAME TO bar

Pavel Stehule
2012-01-23 23:25:04 +00:00
Magnus Hagander ae137bcaab Fix warning about unused variable 2012-01-18 10:24:15 +01:00
Alvaro Herrera 3b11247aad Disallow merging ONLY constraints in children tables
When creating a child table, or when attaching an existing table as
child of another, we must not allow inheritable constraints to be
merged with non-inheritable ones, because then grandchildren would not
properly get the constraint.  This would violate the grandparent's
expectations.

Bugs noted by Robert Haas.

Author: Nikhil Sontakke
2012-01-16 19:27:05 -03:00
Robert Haas 1575fbcb79 Prevent adding relations to a concurrently dropped schema.
In the previous coding, it was possible for a relation to be created
via CREATE TABLE, CREATE VIEW, CREATE SEQUENCE, CREATE FOREIGN TABLE,
etc.  in a schema while that schema was meanwhile being concurrently
dropped.  This led to a pg_class entry with an invalid relnamespace
value.  The same problem could occur if a relation was moved using
ALTER .. SET SCHEMA while the target schema was being concurrently
dropped.  This patch prevents both of those scenarios by locking the
schema to which the relation is being added using AccessShareLock,
which conflicts with the AccessExclusiveLock taken by DROP.

As a desirable side effect, this also prevents the use of CREATE OR
REPLACE VIEW to queue for an AccessExclusiveLock on a relation on which
you have no rights: that will now fail immediately with a permissions
error, before trying to obtain a lock.

We need similar protection for all other object types, but as everything
other than relations uses a slightly different set of code paths, I'm
leaving that for a separate commit.

Original complaint (as far as I could find) about CREATE by Nikhil
Sontakke; risk for ALTER .. SET SCHEMA pointed out by Tom Lane;
further details by Dan Farina; patch by me; review by Hitoshi Harada.
2012-01-16 09:49:34 -05:00
Heikki Linnakangas 00c5f55061 Make superuser imply replication privilege. The idea of a privilege that
superuser doesn't have doesn't make much sense, as a superuser can do
whatever he wants through other means, anyway. So instead of granting
replication privilege to superusers in CREATE USER time by default, allow
replication connection from superusers whether or not they have the
replication privilege.

Patch by Noah Misch, per discussion on bug report #6264
2012-01-14 18:22:16 +02:00
Robert Haas d0dcb315db Fix broken logic in lazy_vacuum_heap.
As noted by Tom Lane, the previous coding in this area, which I
introduced in commit bbb6e559c4, was
poorly tested and caused the vacuum's second heap to go into what would
have been an infinite loop but for the fact that it eventually caused a
memory allocation failure.  This version seems to work better.
2012-01-13 08:22:31 -05:00
Tom Lane 21b446dd09 Fix CLUSTER/VACUUM FULL for toast values owned by recently-updated rows.
In commit 7b0d0e9356, I made CLUSTER and
VACUUM FULL try to preserve toast value OIDs from the original toast table
to the new one.  However, if we have to copy both live and recently-dead
versions of a row that has a toasted column, those versions may well
reference the same toast value with the same OID.  The patch then led to
duplicate-key failures as we tried to insert the toast value twice with the
same OID.  (The previous behavior was not very desirable either, since it
would have silently inserted the same value twice with different OIDs.
That wastes space, but what's worse is that the toast values inserted for
already-dead heap rows would not be reclaimed by subsequent ordinary
VACUUMs, since they go into the new toast table marked live not deleted.)

To fix, check if the copied OID already exists in the new toast table, and
if so, assume that it stores the desired value.  This is reasonably safe
since the only case where we will copy an OID from a previous toast pointer
is when toast_insert_or_update was given that toast pointer and so we just
pulled the data from the old table; if we got two different values that way
then we have big problems anyway.  We do have to assume that no other
backend is inserting items into the new toast table concurrently, but
that's surely safe for CLUSTER and VACUUM FULL.

Per bug #6393 from Maxim Boguk.  Back-patch to 9.0, same as the previous
patch.
2012-01-12 16:40:14 -05:00
Robert Haas df970a0ac8 Fix backwards logic in previous commit.
I wrote this code before committing it, but managed not to include it in
the actual commit.
2012-01-06 22:54:43 -05:00
Robert Haas 1489e2f26a Improve behavior of concurrent ALTER TABLE, and do some refactoring.
ALTER TABLE (and ALTER VIEW, ALTER SEQUENCE, etc.) now use a
RangeVarGetRelid callback to check permissions before acquiring a table
lock.  We also now use the same callback for all forms of ALTER TABLE,
rather than having separate, almost-identical callbacks for ALTER TABLE
.. SET SCHEMA and ALTER TABLE .. RENAME, and no callback at all for
everything else.

I went ahead and changed the code so that no form of ALTER TABLE works
on foreign tables; you must use ALTER FOREIGN TABLE instead.  In 9.1,
it was possible to use ALTER TABLE .. SET SCHEMA or ALTER TABLE ..
RENAME on a foreign table, but not any other form of ALTER TABLE, which
did not seem terribly useful or consistent.

Patch by me; review by Noah Misch.
2012-01-06 22:42:26 -05:00
Peter Eisentraut 104e7dac28 Improve ALTER DOMAIN / DROP CONSTRAINT with nonexistent constraint
ALTER DOMAIN / DROP CONSTRAINT on a nonexistent constraint name did
not report any error.  Now it reports an error.  The IF EXISTS option
was added to get the usual behavior of ignoring nonexistent objects to
drop.
2012-01-05 19:48:55 +02:00
Bruce Momjian e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
Robert Haas 0e4611c023 Add a security_barrier option for views.
When a view is marked as a security barrier, it will not be pulled up
into the containing query, and no quals will be pushed down into it,
so that no function or operator chosen by the user can be applied to
rows not exposed by the view.  Views not configured with this
option cannot provide robust row-level security, but will perform far
better.

Patch by KaiGai Kohei; original problem report by Heikki Linnakangas
(in October 2009!).  Review (in earlier versions) by Noah Misch and
others.  Design advice by Tom Lane and myself.  Further review and
cleanup by me.
2011-12-22 16:16:31 -05:00
Peter Eisentraut f90dd28062 Add ALTER DOMAIN ... RENAME
You could already rename domains using ALTER TYPE, but with this new
command it is more consistent with how other commands treat domains as
a subcategory of types.
2011-12-22 22:43:56 +02:00
Tom Lane c31224e257 Update per-column ACLs, not only per-table ACL, when changing table owner.
We forgot to modify column ACLs, so privileges were still shown as having
been granted by the old owner.  This meant that neither the new owner nor
a superuser could revoke the now-untraceable-to-table-owner permissions.
Per bug #6350 from Marc Balmer.

This has been wrong since column ACLs were added, so back-patch to 8.4.
2011-12-21 18:23:11 -05:00
Robert Haas cbe24a6dd8 Improve behavior of concurrent CLUSTER.
In the previous coding, a user could queue up for an AccessExclusiveLock
on a table they did not have permission to cluster, thus potentially
interfering with access by authorized users who got stuck waiting behind
the AccessExclusiveLock.  This approach avoids that.  cluster() has the
same permissions-checking requirements as REINDEX TABLE, so this commit
moves the now-shared callback to tablecmds.c and renames it, per
discussion with Noah Misch.
2011-12-21 15:17:28 -05:00
Robert Haas d573e239f0 Take fewer snapshots.
When a PORTAL_ONE_SELECT query is executed, we can opportunistically
reuse the parse/plan shot for the execution phase.  This cuts down the
number of snapshots per simple query from 2 to 1 for the simple
protocol, and 3 to 2 for the extended protocol.  Since we are only
reusing a snapshot taken early in the processing of the same protocol
message, the change shouldn't be user-visible, except that the remote
possibility of the planning and execution snapshots being different is
eliminated.

Note that this change does not make it safe to assume that the parse/plan
snapshot will certainly be reused; that will currently only happen if
PortalStart() decides to use the PORTAL_ONE_SELECT strategy.  It might
be worth trying to provide some stronger guarantees here in the future,
but for now we don't.

Patch by me; review by Dimitri Fontaine.
2011-12-21 09:16:55 -05:00
Peter Eisentraut 729205571e Add support for privileges on types
This adds support for the more or less SQL-conforming USAGE privilege
on types and domains.  The intent is to be able restrict which users
can create dependencies on types, which restricts the way in which
owners can alter types.

reviewed by Yeb Havinga
2011-12-20 00:05:19 +02:00
Alvaro Herrera 61d81bd28d Allow CHECK constraints to be declared ONLY
This makes them enforceable only on the parent table, not on children
tables.  This is useful in various situations, per discussion involving
people bitten by the restrictive behavior introduced in 8.4.

Message-Id:
8762mp93iw.fsf@comcast.net
CAFaPBrSMMpubkGf4zcRL_YL-AERUbYF_-ZNNYfb3CVwwEqc9TQ@mail.gmail.com

Authors: Nikhil Sontakke, Alex Hunsaker
Reviewed by Robert Haas and myself
2011-12-19 17:30:23 -03:00
Robert Haas 1da5c11959 Improve behavior of concurrent ALTER <relation> .. SET SCHEMA.
If the referrent of a name changes while we're waiting for the lock,
we must recheck permissons.  We also now check the relkind before
locking, since it's easy to do that long the way.

Patch by me; review by Noah Misch.
2011-12-15 19:02:58 -05:00
Robert Haas 74a1d4fe7c Improve behavior of concurrent rename statements.
Previously, renaming a table, sequence, view, index, foreign table,
column, or trigger checked permissions before locking the object, which
meant that if permissions were revoked during the lock wait, we would
still allow the operation.  Similarly, if the original object is dropped
and a new one with the same name is created, the operation will be allowed
if we had permissions on the old object; the permissions on the new
object don't matter.  All this is now fixed.

Along the way, attempting to rename a trigger on a foreign table now gives
the same error message as trying to create one there in the first place
(i.e. that it's not a table or view) rather than simply stating that no
trigger by that name exists.

Patch by me; review by Noah Misch.
2011-12-15 19:02:38 -05:00
Peter Eisentraut 5bcf8ede45 Add ALTER FOREIGN DATA WRAPPER / RENAME and ALTER SERVER / RENAME 2011-12-09 20:42:30 +02:00
Magnus Hagander 16d8e594ac Remove spclocation field from pg_tablespace
Instead, add a function pg_tablespace_location(oid) used to return
the same information, and do this by reading the symbolic link.

Doing it this way makes it possible to relocate a tablespace when the
database is down by simply changing the symbolic link.
2011-12-07 10:37:33 +01:00
Tom Lane c6e3ac11b6 Create a "sort support" interface API for faster sorting.
This patch creates an API whereby a btree index opclass can optionally
provide non-SQL-callable support functions for sorting.  In the initial
patch, we only use this to provide a directly-callable comparator function,
which can be invoked with a bit less overhead than the traditional
SQL-callable comparator.  While that should be of value in itself, the real
reason for doing this is to provide a datatype-extensible framework for
more aggressive optimizations, as in Peter Geoghegan's recent work.

Robert Haas and Tom Lane
2011-12-07 00:19:39 -05:00
Robert Haas d2a662182e Typo fixes for commit 2ad36c4e44.
Noted during post-commit review by by Noah Misch.
2011-12-06 15:50:02 -05:00
Robert Haas 2ad36c4e44 Improve table locking behavior in the face of current DDL.
In the previous coding, callers were faced with an awkward choice:
look up the name, do permissions checks, and then lock the table; or
look up the name, lock the table, and then do permissions checks.
The first choice was wrong because the results of the name lookup
and permissions checks might be out-of-date by the time the table
lock was acquired, while the second allowed a user with no privileges
to interfere with access to a table by users who do have privileges
(e.g. if a malicious backend queues up for an AccessExclusiveLock on
a table on which AccessShareLock is already held, further attempts
to access the table will be blocked until the AccessExclusiveLock
is obtained and the malicious backend's transaction rolls back).

To fix, allow callers of RangeVarGetRelid() to pass a callback which
gets executed after performing the name lookup but before acquiring
the relation lock.  If the name lookup is retried (because
invalidation messages are received), the callback will be re-executed
as well, so we get the best of both worlds.  RangeVarGetRelid() is
renamed to RangeVarGetRelidExtended(); callers not wishing to supply
a callback can continue to invoke it as RangeVarGetRelid(), which is
now a macro.  Since the only one caller that uses nowait = true now
passes a callback anyway, the RangeVarGetRelid() macro defaults nowait
as well.  The callback can also be used for supplemental locking - for
example, REINDEX INDEX needs to acquire the table lock before the index
lock to reduce deadlock possibilities.

There's a lot more work to be done here to fix all the cases where this
can be a problem, but this commit provides the general infrastructure
and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE,
LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE.

Per discussion with Noah Misch and Alvaro Herrera.
2011-11-30 10:27:00 -05:00