Commit Graph

527 Commits

Author SHA1 Message Date
Bruce Momjian 807b9e0dff pgindent run for 9.5 2015-05-23 21:35:49 -04:00
Tom Lane 0bb8528b5c Fix postgres_fdw to return the right ctid value in EvalPlanQual cases.
If a postgres_fdw foreign table is a non-locked source relation in an
UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, and the query selects its
ctid column, the wrong value would be returned if an EvalPlanQual
recheck occurred.  This happened because the foreign table's result row
was copied via the ROW_MARK_COPY code path, and EvalPlanQualFetchRowMarks
just unconditionally set the reconstructed tuple's t_self to "invalid".

To fix that, we can have EvalPlanQualFetchRowMarks copy the composite
datum's t_ctid field, and be sure to initialize that along with t_self
when postgres_fdw constructs a tuple to return.

If we just did that much then EvalPlanQualFetchRowMarks would start
returning "(0,0)" as ctid for all other ROW_MARK_COPY cases, which perhaps
does not matter much, but then again maybe it might.  The cause of that is
that heap_form_tuple, which is the ultimate source of all composite datums,
simply leaves t_ctid as zeroes in newly constructed tuples.  That seems
like a bad idea on general principles: a field that's really not been
initialized shouldn't appear to have a valid value.  So let's eat the
trivial additional overhead of doing "ItemPointerSetInvalid(&(td->t_ctid))"
in heap_form_tuple.

This closes out our handling of Etsuro Fujita's report that tableoid and
ctid weren't correctly set in postgres_fdw EvalPlanQual cases.  Along the
way we did a great deal of work to improve FDWs' ability to control row
locking behavior; which was not wasted effort by any means, but it didn't
end up being a fix for this problem because that feature would be too
expensive for postgres_fdw to use all the time.

Although the fix for the tableoid misbehavior was back-patched, I'm
hesitant to do so here; it seems far less likely that people would care
about remote ctid than tableoid, and even such a minor behavioral change
as this in heap_form_tuple is perhaps best not back-patched.  So commit
to HEAD only, at least for the moment.

Etsuro Fujita, with some adjustments by me
2015-05-13 14:05:29 -04:00
Tom Lane afb9249d06 Add support for doing late row locking in FDWs.
Previously, FDWs could only do "early row locking", that is lock a row as
soon as it's fetched, even though local restriction/join conditions might
discard the row later.  This patch adds callbacks that allow FDWs to do
late locking in the same way that it's done for regular tables.

To make use of this feature, an FDW must support the "ctid" column as a
unique row identifier.  Currently, since ctid has to be of type TID,
the feature is of limited use, though in principle it could be used by
postgres_fdw.  We may eventually allow FDWs to specify another data type
for ctid, which would make it possible for more FDWs to use this feature.

This commit does not modify postgres_fdw to use late locking.  We've
tested some prototype code for that, but it's not in committable shape,
and besides it's quite unclear whether it actually makes sense to do late
locking against a remote server.  The extra round trips required are likely
to outweigh any benefit from improved concurrency.

Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
2015-05-12 14:10:17 -04:00
Andres Freund 168d5805e4 Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:43:10 +02:00
Andres Freund 2c8f4836db Represent columns requiring insert and update privileges indentently.
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held.  As statements so far required
either insert or update privileges but never both, that was
sufficient. The required permission could be inferred from the top level
statement run.

The upcoming INSERT ... ON CONFLICT UPDATE feature needs to
independently check for both privileges in one statement though, so that
is not sufficient anymore.

Bumps catversion as stored rules change.

Author: Peter Geoghegan
Reviewed-By: Andres Freund
2015-05-08 00:20:46 +02:00
Robert Haas 924bcf4f16 Create an infrastructure for parallel computation in PostgreSQL.
This does four basic things.  First, it provides convenience routines
to coordinate the startup and shutdown of parallel workers.  Second,
it synchronizes various pieces of state (e.g. GUCs, combo CID
mappings, transaction snapshot) from the parallel group leader to the
worker processes.  Third, it prohibits various operations that would
result in unsafe changes to that state while parallelism is active.
Finally, it propagates events that would result in an ErrorResponse,
NoticeResponse, or NotifyResponse message being sent to the client
from the parallel workers back to the master, from which they can then
be sent on to the client.

Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke.
Suggestions and review from Andres Freund, Heikki Linnakangas, Noah
Misch, Simon Riggs, Euler Taveira, and Jim Nasby.
2015-04-30 15:02:14 -04:00
Stephen Frost e89bd02f58 Perform RLS WITH CHECK before constraints, etc
The RLS capability is built on top of the WITH CHECK OPTION
system which was added for auto-updatable views, however, unlike
WCOs on views (which are mandated by the SQL spec to not fire until
after all other constraints and checks are done), it makes much more
sense for RLS checks to happen earlier than constraint and uniqueness
checks.

This patch reworks the structure which holds the WCOs a bit to be
explicitly either VIEW or RLS checks and the RLS-related checks are
done prior to the constraint and uniqueness checks.  This also allows
better error reporting as we are now reporting when a violation is due
to a WITH CHECK OPTION and when it's due to an RLS policy violation,
which was independently noted by Craig Ringer as being confusing.

The documentation is also updated to include a paragraph about when RLS
WITH CHECK handling is performed, as there have been a number of
questions regarding that and the documentation was previously silent on
the matter.

Author: Dean Rasheed, with some kabitzing and comment changes by me.
2015-04-24 20:34:26 -04:00
Tom Lane feeb526cfe Fix ExecOpenScanRelation to take a lock on a ROW_MARK_COPY relation.
ExecOpenScanRelation assumed that any relation listed in the ExecRowMark
list has been locked by InitPlan; but this is not true if the rel's
markType is ROW_MARK_COPY, which is possible if it's a foreign table.

In most (possibly all) cases, failure to acquire a lock here isn't really
problematic because the parser, planner, or plancache would have taken the
appropriate lock already.  In principle though it might leave us vulnerable
to working with a relation that we hold no lock on, and in any case if the
executor isn't depending on previously-taken locks otherwise then it should
not do so for ROW_MARK_COPY relations.

Noted by Etsuro Fujita.  Back-patch to all active versions, since the
inconsistency has been there a long time.  (It's almost certainly
irrelevant in 9.0, since that predates foreign tables, but the code's
still wrong on its own terms.)
2015-03-24 15:53:06 -04:00
Tom Lane cb1ca4d800 Allow foreign tables to participate in inheritance.
Foreign tables can now be inheritance children, or parents.  Much of the
system was already ready for this, but we had to fix a few things of
course, mostly in the area of planner and executor handling of row locks.

As side effects of this, allow foreign tables to have NOT VALID CHECK
constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to
accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS.  Continuing to
disallow these things would've required bizarre and inconsistent special
cases in inheritance behavior.  Since foreign tables don't enforce CHECK
constraints anyway, a NOT VALID one is a complete no-op, but that doesn't
mean we shouldn't allow it.  And it's possible that some FDWs might have
use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops
for most.

An additional change in support of this is that when a ModifyTable node
has multiple target tables, they will all now be explicitly identified
in EXPLAIN output, for example:

 Update on pt1  (cost=0.00..321.05 rows=3541 width=46)
   Update on pt1
   Foreign Update on ft1
   Foreign Update on ft2
   Update on child3
   ->  Seq Scan on pt1  (cost=0.00..0.00 rows=1 width=46)
   ->  Foreign Scan on ft1  (cost=100.00..148.03 rows=1170 width=46)
   ->  Foreign Scan on ft2  (cost=100.00..148.03 rows=1170 width=46)
   ->  Seq Scan on child3  (cost=0.00..25.00 rows=1200 width=46)

This was done mainly to provide an unambiguous place to attach "Remote SQL"
fields, but it is useful for inherited updates even when no foreign tables
are involved.

Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro
Horiguchi, some additional hacking by me
2015-03-22 13:53:21 -04:00
Tom Lane 443fd0540e Ensure tableoid reads correctly in EvalPlanQual-manufactured tuples.
The ROW_MARK_COPY path in EvalPlanQualFetchRowMarks() was just setting
tableoid to InvalidOid, I think on the assumption that the referenced
RTE must be a subquery or other case without a meaningful OID.  However,
foreign tables also use this code path, and they do have meaningful
table OIDs; so failure to set the tuple field can lead to user-visible
misbehavior.  Fix that by fetching the appropriate OID from the range
table.

There's still an issue about whether CTID can ever have a meaningful
value in this case; at least with postgres_fdw foreign tables, it does.
But that is a different problem that seems to require a significantly
different patch --- it's debatable whether postgres_fdw really wants to
use this code path at all.

Simplified version of a patch by Etsuro Fujita, who also noted the
problem to begin with.  The issue can be demonstrated in all versions
having FDWs, so back-patch to 9.1.
2015-03-12 13:39:09 -04:00
Heikki Linnakangas 57fe246890 Fix reference-after-free when waiting for another xact due to constraint.
If an insertion or update had to wait for another transaction to finish,
because there was another insertion with conflicting key in progress,
we would pass a just-free'd item pointer to XactLockTableWait().

All calls to XactLockTableWait() and MultiXactIdWait() had similar issues.
Some passed a pointer to a buffer in the buffer cache, after already
releasing the lock. The call in EvalPlanQualFetch had already released the
pin too. All but the call in execUtils.c would merely lead to reporting a
bogus ctid, however (or an assertion failure, if enabled).

All the callers that passed HeapTuple->t_data->t_ctid were slightly bogus
anyway: if the tuple was updated (again) in the same transaction, its ctid
field would point to the next tuple in the chain, not the tuple itself.

Backpatch to 9.4, where the 'ctid' argument to XactLockTableWait was added
(in commit f88d4cfc)
2015-02-04 16:00:34 +02:00
Stephen Frost 804b6b6db4 Fix column-privilege leak in error-message paths
While building error messages to return to the user,
BuildIndexValueDescription, ExecBuildSlotValueDescription and
ri_ReportViolation would happily include the entire key or entire row in
the result returned to the user, even if the user didn't have access to
view all of the columns being included.

Instead, include only those columns which the user is providing or which
the user has select rights on.  If the user does not have any rights
to view the table or any of the columns involved then no detail is
provided and a NULL value is returned from BuildIndexValueDescription
and ExecBuildSlotValueDescription.  Note that, for key cases, the user
must have access to all of the columns for the key to be shown; a
partial key will not be returned.

Further, in master only, do not return any data for cases where row
security is enabled on the relation and row security should be applied
for the user.  This required a bit of refactoring and moving of things
around related to RLS- note the addition of utils/misc/rls.c.

Back-patch all the way, as column-level privileges are now in all
supported versions.

This has been assigned CVE-2014-8161, but since the issue and the patch
have already been publicized on pgsql-hackers, there's no point in trying
to hide this commit.
2015-01-28 12:31:30 -05:00
Robert Haas 1605291b6c Typo fix.
Etsuro Fujita
2015-01-19 11:36:48 -05:00
Tom Lane c480cb9d24 Fix use-of-already-freed-memory problem in EvalPlanQual processing.
Up to now, the "child" executor state trees generated for EvalPlanQual
rechecks have simply shared the ResultRelInfo arrays used for the original
execution tree.  However, this leads to dangling-pointer problems, because
ExecInitModifyTable() is all too willing to scribble on some fields of the
ResultRelInfo(s) even when it's being run in one of those child trees.
This trashes those fields from the perspective of the parent tree, because
even if the generated subtree is logically identical to what was in use in
the parent, it's in a memory context that will go away when we're done
with the child state tree.

We do however want to share information in the direction from the parent
down to the children; in particular, fields such as es_instrument *must*
be shared or we'll lose the stats arising from execution of the children.
So the simplest fix is to make a copy of the parent's ResultRelInfo array,
but not copy any fields back at end of child execution.

Per report from Manuel Kniep.  The added isolation test is based on his
example.  In an unpatched memory-clobber-enabled build it will reliably
fail with "ctid is NULL" errors in all branches back to 9.1, as a
consequence of junkfilter->jf_junkAttNo being overwritten with $7f7f.
This test cannot be run as-is before that for lack of WITH syntax; but
I have no doubt that some variant of this problem can arise in older
branches, so apply the code change all the way back.
2015-01-15 18:52:58 -05:00
Stephen Frost c4fda14845 Fix typo in execMain.c
Wee -> We.

Pointed out by Etsuro Fujita.
2015-01-09 11:07:35 -05:00
Bruce Momjian 4baaf863ec Update copyright for 2015
Backpatch certain files through 9.0
2015-01-06 11:43:47 -05:00
Tom Lane f4e031c662 Add bms_next_member(), and use it where appropriate.
This patch adds a way of iterating through the members of a bitmapset
nondestructively, unlike the old way with bms_first_member().  While
bms_next_member() is very slightly slower than bms_first_member()
(at least for typical-size bitmapsets), eliminating the need to palloc
and pfree a temporary copy of the target bitmapset is a significant win.
So this method should be preferred in all cases where a temporary copy
would be necessary.

Tom Lane, with suggestions from Dean Rasheed and David Rowley
2014-11-28 13:37:25 -05:00
Stephen Frost 143b39c185 Rename pg_rowsecurity -> pg_policy and other fixes
As pointed out by Robert, we should really have named pg_rowsecurity
pg_policy, as the objects stored in that catalog are policies.  This
patch fixes that and updates the column names to start with 'pol' to
match the new catalog name.

The security consideration for COPY with row level security, also
pointed out by Robert, has also been addressed by remembering and
re-checking the OID of the relation initially referenced during COPY
processing, to make sure it hasn't changed under us by the time we
finish planning out the query which has been built.

Robert and Alvaro also commented on missing OCLASS and OBJECT entries
for POLICY (formerly ROWSECURITY or POLICY, depending) in various
places.  This patch fixes that too, which also happens to add the
ability to COMMENT on policies.

In passing, attempt to improve the consistency of messages, comments,
and documentation as well.  This removes various incarnations of
'row-security', 'row-level security', 'Row-security', etc, in favor
of 'policy', 'row level security' or 'row_security' as appropriate.

Happy Thanksgiving!
2014-11-27 01:15:57 -05:00
Alvaro Herrera df630b0dd5 Implement SKIP LOCKED for row-level locks
This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro
2014-10-07 17:23:34 -03:00
Stephen Frost 491c029dbc Row-Level Security Policies (RLS)
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table.  Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.

New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner.  Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.

Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used.  If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.

By default, row security is applied at all times except for the
table owner and the superuser.  A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE.  When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.

Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.

A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.

Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.

Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.

Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
2014-09-19 11:18:35 -04:00
Alvaro Herrera 1c9701cfe5 Fix FOR UPDATE NOWAIT on updated tuple chains
If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently
being updated, it might fail to honor its NOWAIT specification and block
instead of raising an error.

Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down
to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid
blocking while waiting for a concurrent transaction.

Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro
http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com

Per Thomas Munro in the course of his SKIP LOCKED feature submission,
who also provided one of the isolation test specs.

Backpatch to 9.4, because that's as far back as it applies without
conflicts (although the bug goes all the way back).  To that branch also
backpatch Thomas Munro's new NOWAIT test cases, committed in master by
Heikki as commit 9ee16b49f0 .
2014-08-27 19:15:18 -04:00
Bruce Momjian 0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
Alvaro Herrera f88d4cfc9d Setup error context callback for transaction lock waits
With this in place, a session blocking behind another one because of
tuple locks will get a context line mentioning the relation name, tuple
TID, and operation being done on tuple.  For example:

LOG:  process 11367 still waiting for ShareLock on transaction 717 after 1000.108 ms
DETAIL:  Process holding the lock: 11366. Wait queue: 11367.
CONTEXT:  while updating tuple (0,2) in relation "foo"
STATEMENT:  UPDATE foo SET value = 3;

Most usefully, the new line is displayed by log entries due to
log_lock_waits, although of course it will be printed by any other log
message as well.

Author: Christian Kruse, some tweaks by Álvaro Herrera
Reviewed-by: Amit Kapila, Andres Freund, Tom Lane, Robert Haas
2014-03-19 15:10:36 -03:00
Tom Lane bf4052faa1 Don't reject ROW_MARK_REFERENCE rowmarks for materialized views.
We should allow this so that matviews can be referenced in UPDATE/DELETE
statements in READ COMMITTED isolation level.  The requirement for that
is that a re-fetch by TID will see the same row version the query saw
earlier, which is true of matviews, so there's no reason for the
restriction.  Per bug #9398.

Michael Paquier, after a suggestion by me
2014-03-06 11:37:02 -05:00
Bruce Momjian 7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Tom Lane c28b289bf3 Prevent display of dropped columns in row constraint violation messages.
ExecBuildSlotValueDescription() printed "null" for each dropped column in
a row being complained of by ExecConstraints().  This has some sanity in
terms of the underlying implementation, but is of course pretty surprising
to users.  To fix, we must pass the target relation's descriptor to
ExecBuildSlotValueDescription(), because the slot descriptor it had been
using doesn't get labeled with attisdropped markers.

Per bug #8408 from Maxim Boguk.  Back-patch to 9.2 where the feature of
printing row values in NOT NULL and CHECK constraint violation messages
was introduced.

Michael Paquier and Tom Lane
2013-11-07 14:41:36 -05:00
Kevin Grittner be420fa02e Fix subquery reference to non-populated MV in CMV.
A subquery reference to a matview should be allowed by CREATE
MATERIALIZED VIEW WITH NO DATA, just like a direct reference is.

Per bug report from Laurent Sartran.

Backpatch to 9.3.
2013-11-02 18:38:17 -05:00
Peter Eisentraut 32f7c0ae17 Improve error message when view is not updatable
Avoid using the term "updatable" in confusing ways.  Suggest a trigger
first, before a rule.
2013-08-14 23:02:59 -04:00
Stephen Frost 4cbe3ac3e8 WITH CHECK OPTION support for auto-updatable VIEWs
For simple views which are automatically updatable, this patch allows
the user to specify what level of checking should be done on records
being inserted or updated.  For 'LOCAL CHECK', new tuples are validated
against the conditionals of the view they are being inserted into, while
for 'CASCADED CHECK' the new tuples are validated against the
conditionals for all views involved (from the top down).

This option is part of the SQL specification.

Dean Rasheed, reviewed by Pavel Stehule
2013-07-18 17:10:16 -04:00
Kevin Grittner cc1965a99b Add support for REFRESH MATERIALIZED VIEW CONCURRENTLY.
This allows reads to continue without any blocking while a REFRESH
runs.  The new data appears atomically as part of transaction
commit.

Review questioned the Assert that a matview was not a system
relation.  This will be addressed separately.

Reviewed by Hitoshi Harada, Robert Haas, Andres Freund.
Merged after review with security patch f3ab5d4.
2013-07-16 12:55:44 -05:00
Tom Lane dc3eb56383 Improve updatability checking for views and foreign tables.
Extend the FDW API (which we already changed for 9.3) so that an FDW can
report whether specific foreign tables are insertable/updatable/deletable.
The default assumption continues to be that they're updatable if the
relevant executor callback function is supplied by the FDW, but finer
granularity is now possible.  As a test case, add an "updatable" option to
contrib/postgres_fdw.

This patch also fixes the information_schema views, which previously did
not think that foreign tables were ever updatable, and fixes
view_is_auto_updatable() so that a view on a foreign table can be
auto-updatable.

initdb forced due to changes in information_schema views and the functions
they rely on.  This is a bit unfortunate to do post-beta1, but if we don't
change this now then we'll have another API break for FDWs when we do
change it.

Dean Rasheed, somewhat editorialized on by Tom Lane
2013-06-12 17:53:33 -04:00
Bruce Momjian 9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Tom Lane 5194024d72 Incidental cleanup of matviews code.
Move checking for unscannable matviews into ExecOpenScanRelation, which is
a better place for it first because the open relation is already available
(saving a relcache lookup cycle), and second because this eliminates the
problem of telling the difference between rangetable entries that will or
will not be scanned by the query.  In particular we can get rid of the
not-terribly-well-thought-out-or-implemented isResultRel field that the
initial matviews patch added to RangeTblEntry.

Also get rid of entirely unnecessary scannability check in the rewriter,
and a bogus decision about whether RefreshMatViewStmt requires a parse-time
snapshot.

catversion bump due to removal of a RangeTblEntry field, which changes
stored rules.
2013-04-27 17:48:57 -04:00
Peter Eisentraut cc26ea9fe2 Clean up references to SQL92
In most cases, these were just references to the SQL standard in
general.  In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
2013-04-20 11:04:41 -04:00
Kevin Grittner 52e6e33ab4 Create a distinction between a populated matview and a scannable one.
The intent was that being populated would, long term, be just one
of the conditions which could affect whether a matview was
scannable; being populated should be necessary but not always
sufficient to scan the relation.  Since only CREATE and REFRESH
currently determine the scannability, names and comments
accidentally conflated these concepts, leading to confusion.

Also add missing locking for the SQL function which allows a
test for scannability, and fix a modularity violatiion.

Per complaints from Tom Lane, although its not clear that these
will satisfy his concerns.  Hopefully this will at least better
frame the discussion.
2013-04-09 13:02:49 -05:00
Tom Lane 21734d2fb8 Support writable foreign tables.
This patch adds the core-system infrastructure needed to support updates
on foreign tables, and extends contrib/postgres_fdw to allow updates
against remote Postgres servers.  There's still a great deal of room for
improvement in optimization of remote updates, but at least there's basic
functionality there now.

KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
heavily revised by Tom Lane.
2013-03-10 14:16:02 -04:00
Kevin Grittner 3bf3ab8c56 Add a materialized view relations.
A materialized view has a rule just like a view and a heap and
other physical properties like a table.  The rule is only used to
populate the table, references in queries refer to the
materialized data.

This is a minimal implementation, but should still be useful in
many cases.  Currently data is only populated "on demand" by the
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.
It is expected that future releases will add incremental updates
with various timings, and that a more refined concept of defining
what is "fresh" data will be developed.  At some point it may even
be possible to have queries use a materialized in place of
references to underlying tables, but that requires the other
above-mentioned features to be working first.

Much of the documentation work by Robert Haas.
Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja
Security review by KaiGai Kohei, with a decision on how best to
implement sepgsql still pending.
2013-03-03 18:23:31 -06:00
Tom Lane 991f3e5ab3 Provide database object names as separate fields in error messages.
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure.  It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error.  (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)

Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.

Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
2013-01-29 17:08:26 -05:00
Tom Lane 2ddc600f8f Fix SPI documentation for new handling of ExecutorRun's count parameter.
Since 9.0, the count parameter has only limited the number of tuples
actually returned by the executor.  It doesn't affect the behavior of
INSERT/UPDATE/DELETE unless RETURNING is specified, because without
RETURNING, the ModifyTable plan node doesn't return control to execMain.c
for each tuple.  And we only check the limit at the top level.

While this behavioral change was unintentional at the time, discussion of
bug #6572 led us to the conclusion that we prefer the new behavior anyway,
and so we should just adjust the docs to match rather than change the code.
Accordingly, do that.  Back-patch as far as 9.0 so that the docs match the
code in each branch.
2013-01-24 18:34:00 -05:00
Alvaro Herrera 0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00
Bruce Momjian bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Tom Lane a99c42f291 Support automatically-updatable views.
This patch makes "simple" views automatically updatable, without the need
to create either INSTEAD OF triggers or INSTEAD rules.  "Simple" views
are those classified as updatable according to SQL-92 rules.  The rewriter
transforms INSERT/UPDATE/DELETE commands on such views directly into an
equivalent command on the underlying table, which will generally have
noticeably better performance than is possible with either triggers or
user-written rules.  A view that has INSTEAD OF triggers or INSTEAD rules
continues to operate the same as before.

For the moment, security_barrier views are not considered simple.
Also, we do not support WITH CHECK OPTION.  These features may be
added in future.

Dean Rasheed, reviewed by Amit Kapila
2012-12-08 18:26:21 -05:00
Kevin Grittner 6868ed7491 Throw error if expiring tuple is again updated or deleted.
This prevents surprising behavior when a FOR EACH ROW trigger
BEFORE UPDATE or BEFORE DELETE directly or indirectly updates or
deletes the the old row.  Prior to this patch the requested action
on the row could be silently ignored while all triggered actions
based on the occurence of the requested action could be committed.
One example of how this could happen is if the BEFORE DELETE
trigger for a "parent" row deleted "children" which had trigger
functions to update summary or status data on the parent.

This also prevents similar surprising problems if the query has a
volatile function which updates a target row while it is already
being updated.

There are related issues present in FOR UPDATE cursors and READ
COMMITTED queries which are not handled by this patch.  These
issues need further evalution to determine what change, if any, is
needed.

Where the new error messages are generated, in most cases the best
fix will be to move code from the BEFORE trigger to an AFTER
trigger.  Where this is not feasible, the trigger can avoid the
error by re-issuing the triggering statement and returning NULL.

Documentation changes will be submitted in a separate patch.

Kevin Grittner and Tom Lane with input from Florian Pflug and
Robert Haas, based on problems encountered during conversion of
Wisconsin Circuit Court trigger logic to plpgsql triggers.
2012-10-26 14:55:36 -05:00
Alvaro Herrera c219d9b0a5 Split tuple struct defs from htup.h to htup_details.h
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.

I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h.  In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
2012-08-30 16:52:35 -04:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Tom Lane 9dbf2b7d75 Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements.  The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.

In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs.  Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.

Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.

Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn".  There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.

Andres Freund and Tom Lane
2012-03-19 21:38:12 -04:00
Tom Lane 7c1719bc68 Fix handling of data-modifying CTE subplans in EvalPlanQual.
We can't just skip initializing such subplans, because the referencing CTE
node will expect to find the subplan available when it initializes.  That
in turn means that ExecInitModifyTable must allow the case (which actually
it needed to do anyway, since there's no guarantee that ModifyTable is
exactly at the top of the CTE plan tree).  So move the complaint about not
being allowed in EvalPlanQual mode to execution instead of initialization.
Testing turned up yet another problem, which is that we'd try to
re-initialize the result relation's index list, leading to leaks and
dangling pointers.

Per report from Phil Sorber.  Back-patch to 9.1 where data-modifying CTEs
were introduced.
2012-01-28 17:43:57 -05:00
Robert Haas 1575fbcb79 Prevent adding relations to a concurrently dropped schema.
In the previous coding, it was possible for a relation to be created
via CREATE TABLE, CREATE VIEW, CREATE SEQUENCE, CREATE FOREIGN TABLE,
etc.  in a schema while that schema was meanwhile being concurrently
dropped.  This led to a pg_class entry with an invalid relnamespace
value.  The same problem could occur if a relation was moved using
ALTER .. SET SCHEMA while the target schema was being concurrently
dropped.  This patch prevents both of those scenarios by locking the
schema to which the relation is being added using AccessShareLock,
which conflicts with the AccessExclusiveLock taken by DROP.

As a desirable side effect, this also prevents the use of CREATE OR
REPLACE VIEW to queue for an AccessExclusiveLock on a relation on which
you have no rights: that will now fail immediately with a permissions
error, before trying to obtain a lock.

We need similar protection for all other object types, but as everything
other than relations uses a slightly different set of code paths, I'm
leaving that for a separate commit.

Original complaint (as far as I could find) about CREATE by Nikhil
Sontakke; risk for ALTER .. SET SCHEMA pointed out by Tom Lane;
further details by Dan Farina; patch by me; review by Hitoshi Harada.
2012-01-16 09:49:34 -05:00
Tom Lane dfd26f9c5f Make executor's SELECT INTO code save and restore original tuple receiver.
As previously coded, the QueryDesc's dest pointer was left dangling
(pointing at an already-freed receiver object) after ExecutorEnd.  It's a
bit astonishing that it took us this long to notice, and I'm not sure that
the known problem case with SQL functions is the only one.  Fix it by
saving and restoring the original receiver pointer, which seems the most
bulletproof way of ensuring any related bugs are also covered.

Per bug #6379 from Paul Ramsey.  Back-patch to 8.4 where the current
handling of SELECT INTO was introduced.
2012-01-04 18:30:55 -05:00
Bruce Momjian e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
Peter Eisentraut 729205571e Add support for privileges on types
This adds support for the more or less SQL-conforming USAGE privilege
on types and domains.  The intent is to be able restrict which users
can create dependencies on types, which restricts the way in which
owners can alter types.

reviewed by Yeb Havinga
2011-12-20 00:05:19 +02:00
Tom Lane f225e4bc54 When a row fails a not-null constraint, show row's contents in errdetail.
Simple extension of previous patch for CHECK constraints.
2011-11-29 18:29:18 -05:00
Tom Lane f1e13001b2 When a row fails a CHECK constraint, show row's contents in errdetail.
This should make it easier to identify which row is problematic when an
insert or update is processing many rows.

The formatting is similar to that for unique-index violation messages,
except that we limit field widths to 64 bytes since otherwise the message
could get unreasonably long.  (In particular, there's currently no attempt
to quote or escape field values that contain commas etc.)

Jan Kundrát, reviewed by Royce Ausburn, somewhat rewritten by me.
2011-11-29 15:02:49 -05:00
Tom Lane 9ed439a9c0 Fix unsupported options in CREATE TABLE ... AS EXECUTE.
The WITH [NO] DATA option was not supported, nor the ability to specify
replacement column names; the former limitation wasn't even documented, as
per recent complaint from Naoya Anzai.  Fix by moving the responsibility
for supporting these options into the executor.  It actually takes less
code this way ...

catversion bump due to change in representation of IntoClause, which might
affect stored rules.
2011-11-24 23:21:45 -05:00
Robert Haas f1b4aa2a84 Check for INSERT privileges in SELECT INTO / CREATE TABLE AS.
In the normal course of events, this matters only if ALTER DEFAULT
PRIVILEGES has been used to revoke default INSERT permission.  Whether
or not the new behavior is more or less likely to be what the user wants
when dealing only with the built-in privilege facilities is arguable,
but it's clearly better when using a loadable module such as sepgsql
that may use the hook in ExecCheckRTPerms to enforce additional
permissions checks.

KaiGai Kohei, reviewed by Albe Laurenz
2011-11-22 16:16:26 -05:00
Tom Lane f197272365 Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps.
This provides information about the numbers of tuples that were visited
but not returned by table scans, as well as the numbers of join tuples
that were considered and discarded within a join plan node.

There is still some discussion going on about the best way to report counts
for outer-join situations, but I think most of what's in the patch would
not change if we revise that, so I'm going to go ahead and commit it as-is.

Documentation changes to follow (they weren't in the submitted patch
either).

Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
2011-09-22 11:30:11 -04:00
Tom Lane b33f78df17 Fix trigger WHEN conditions when both BEFORE and AFTER triggers exist.
Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER
ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger
fired for the same update.  Fix by not trying to overload the use of
estate->es_trig_tuple_slot.  Per report from Yoran Heling.

Back-patch to 9.0, when trigger WHEN conditions were introduced.
2011-08-21 18:15:55 -04:00
Robert Haas 5da79169d3 Fix bugs in relpersistence handling during table creation.
Unlike the relistemp field which it replaced, relpersistence must be
set correctly quite early during the table creation process, as we
rely on it quite early on for a number of purposes, including security
checks.  Normally, this is set based on whether the user enters CREATE
TABLE, CREATE UNLOGGED TABLE, or CREATE TEMPORARY TABLE, but a
relation may also be made implicitly temporary by creating it in
pg_temp.  This patch fixes the handling of that case, and also
disables creation of unlogged tables in temporary tablespace (such
table indeed skip WAL-logging, but we reject an explicit
specification) and creation of relations in the temporary schemas of
other sessions (which is not very sensible, and didn't work right
anyway).

Report by Amit Khandekar.
2011-07-03 17:34:47 -04:00
Alvaro Herrera a40a5d9468 Remove extra copying of TupleDescs for heap_create_with_catalog
Some callers were creating copies of tuple descriptors to pass to that
function, stating in code comments that it was necessary because it
modified the passed descriptor.  Code inspection reveals this not to be
true, and indeed not all callers are passing copies in the first place.
So remove the extra ones and the misleading comments about this behavior
as well.
2011-06-20 10:50:23 -04:00
Tom Lane 21538377ee Disallow SELECT FOR UPDATE/SHARE on sequences.
We can't allow this because such an operation stores its transaction XID
into the sequence tuple's xmax.  Because VACUUM doesn't process sequences
(and we don't want it to start doing so), such an xmax value won't get
frozen, meaning it will eventually refer to nonexistent pg_clog storage,
and even wrap around completely.  Since the row lock is ignored by nextval
and setval, the usefulness of the operation is highly debatable anyway.
Per reports of trouble with pgpool 3.0, which had ill-advisedly started
using such commands as a form of locking.

In HEAD, also disallow SELECT FOR UPDATE/SHARE on toast tables.  Although
this does work safely given the current implementation, there seems no
good reason to allow it.  I refrained from changing that behavior in
back branches, however.
2011-06-02 14:46:15 -04:00
Robert Haas 68ef051f5c Refactor broken CREATE TABLE IF NOT EXISTS support.
Per bug #5988, reported by Marko Tiikkaja, and further analyzed by Tom
Lane, the previous coding was broken in several respects: even if the
target table already existed, a subsequent CREATE TABLE IF NOT EXISTS
might try to add additional constraints or sequences-for-serial
specified in the new CREATE TABLE statement.

In passing, this also fixes a minor information leak: it's no longer
possible to figure out whether a schema to which you don't have CREATE
access contains a sequence named like "x_y_seq" by attempting to create a
table in that schema called "x" with a serial column called "y".

Some more refactoring of this code in the future might be warranted,
but that will need to wait for a later major release.
2011-04-25 16:55:11 -04:00
Bruce Momjian bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Tom Lane a874fe7b4c Refactor the executor's API to support data-modifying CTEs better.
The originally committed patch for modifying CTEs didn't interact well
with EXPLAIN, as noted by myself, and also had corner-case problems with
triggers, as noted by Dean Rasheed.  Those problems show it is really not
practical for ExecutorEnd to call any user-defined code; so split the
cleanup duties out into a new function ExecutorFinish, which must be called
between the last ExecutorRun call and ExecutorEnd.  Some Asserts have been
added to these functions to help verify correct usage.

It is no longer necessary for callers of the executor to call
AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now
done by ExecutorStart/ExecutorFinish respectively.  If you really need to
suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to
ExecutorStart.

Also, refactor portal commit processing to allow for the possibility that
PortalDrop will invoke user-defined code.  I think this is not actually
necessary just yet, since the portal-execution-strategy logic forces any
non-pure-SELECT query to be run to completion before we will consider
committing.  But it seems like good future-proofing.
2011-02-27 13:44:12 -05:00
Tom Lane 389af95155 Support data-modifying commands (INSERT/UPDATE/DELETE) in WITH.
This patch implements data-modifying WITH queries according to the
semantics that the updates all happen with the same command counter value,
and in an unspecified order.  Therefore one WITH clause can't see the
effects of another, nor can the outer query see the effects other than
through the RETURNING values.  And attempts to do conflicting updates will
have unpredictable results.  We'll need to document all that.

This commit just fixes the code; documentation updates are waiting on
author.

Marko Tiikkaja and Hitoshi Harada
2011-02-25 18:58:02 -05:00
Tom Lane bb74240794 Implement an API to let foreign-data wrappers actually be functional.
This commit provides the core code and documentation needed.  A contrib
module test case will follow shortly.

Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
2011-02-20 00:18:14 -05:00
Tom Lane e617f0d7e4 Fix improper matching of resjunk column names for FOR UPDATE in subselect.
Flattening of subquery range tables during setrefs.c could lead to the
rangetable indexes in PlanRowMark nodes not matching up with the column
names previously assigned to the corresponding resjunk ctid (resp. tableoid
or wholerow) columns.  Typical symptom would be either a "cannot extract
system attribute from virtual tuple" error or an Assert failure.  This
wasn't a problem before 9.0 because we didn't support FOR UPDATE below the
top query level, and so the final flattening could never renumber an RTE
that was relevant to FOR UPDATE.  Fix by using a plan-tree-wide unique
number for each PlanRowMark to label the associated resjunk columns, so
that the number need not change during flattening.

Per report from David Johnston (though I'm darned if I can see how this got
past initial testing of the relevant code).  Back-patch to 9.0.
2011-02-09 23:27:42 -05:00
Tom Lane d487afbb81 Fix PlanRowMark/ExecRowMark structures to handle inheritance correctly.
In an inherited UPDATE/DELETE, each target table has its own subplan,
because it might have a column set different from other targets.  This
means that the resjunk columns we add to support EvalPlanQual might be
at different physical column numbers in each subplan.  The EvalPlanQual
rewrite I did for 9.0 failed to account for this, resulting in possible
misbehavior or even crashes during concurrent updates to the same row,
as seen in a recent report from Gordon Shannon.  Revise the data structure
so that we track resjunk column numbers separately for each subplan.

I also chose to move responsibility for identifying the physical column
numbers back to executor startup, instead of assuming that numbers derived
during preprocess_targetlist would stay valid throughout subsequent
massaging of the plan.  That's a bit slower, so we might want to consider
undoing it someday; but it would complicate the patch considerably and
didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
2011-01-12 20:47:02 -05:00
Robert Haas 0d692a0dc9 Basic foreign table support.
Foreign tables are a core component of SQL/MED.  This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried.  Support for foreign table scans will need to
be added in a future patch.  However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.

Shigeru Hanada, heavily revised by Robert Haas
2011-01-01 23:48:11 -05:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Robert Haas 5f7b58fad8 Generalize concept of temporary relations to "relation persistence".
This commit replaces pg_class.relistemp with pg_class.relpersistence;
and also modifies the RangeVar node type to carry relpersistence rather
than istemp.  It also removes removes rd_istemp from RelationData and
instead performs the correct computation based on relpersistence.

For clarity, we add three new macros: RelationNeedsWAL(),
RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we
can clarify the purpose of each check that previous depended on
rd_istemp.

This is intended as infrastructure for the upcoming unlogged tables
patch, as well as for future possible work on global temporary tables.
2010-12-13 12:34:26 -05:00
Tom Lane 2ec993a7cb Support triggers on views.
This patch adds the SQL-standard concept of an INSTEAD OF trigger, which
is fired instead of performing a physical insert/update/delete.  The
trigger function is passed the entire old and/or new rows of the view,
and must figure out what to do to the underlying tables to implement
the update.  So this feature can be used to implement updatable views
using trigger programming style rather than rule hacking.

In passing, this patch corrects the names of some columns in the
information_schema.triggers view.  It seems the SQL committee renamed
them somewhere between SQL:99 and SQL:2003.

Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.
2010-10-10 13:45:07 -04:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Joe Conway 5eb15c9942 SERIALIZABLE transactions are actually implemented beneath the covers with
transaction snapshots, i.e. a snapshot registered at the beginning of
a transaction. Change variable naming and comments to reflect this reality
in preparation for a future, truly serializable mode, e.g.
Serializable Snapshot Isolation (SSI).

For the moment transaction snapshots are still used to implement
SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
Grittner and Dan Ports with review and some minor wording changes by me.
2010-09-11 18:38:58 +00:00
Robert Haas 2a6ef3445c Standardize get_whatever_oid functions for object types with
unqualified names.

- Add a missing_ok parameter to get_tablespace_oid.
- Avoid duplicating get_tablespace_od guts in objectNamesToOids.
- Add a missing_ok parameter to get_database_oid.
- Replace get_roleid and get_role_checked with get_role_oid.
- Add get_namespace_oid, get_language_oid, get_am_oid.
- Refactor existing code to use new interfaces.

Thanks to KaiGai Kohei for the review.
2010-08-05 14:45:09 +00:00
Robert Haas a3b012b560 CREATE TABLE IF NOT EXISTS.
Reviewed by Bernd Helmle.
2010-07-25 23:21:22 +00:00
Robert Haas b8c6c71d1c Centralize DML permissions-checking logic.
Remove bespoke code in DoCopy and RI_Initial_Check, which now instead
fabricate call ExecCheckRTPerms with a manufactured RangeTblEntry.
This is intended to make it feasible for an enhanced security provider
to actually make use of ExecutorCheckPerms_hook, but also has the
advantage that RI_Initial_Check can allow use of the fast-path when
column-level but not table-level permissions are present.

KaiGai Kohei.  Reviewed (in an earlier version) by Stephen Frost, and by me.
Some further changes to the comments by me.
2010-07-22 00:47:59 +00:00
Tom Lane 53e757689c Make NestLoop plan nodes pass outer-relation variables into their inner
relation using the general PARAM_EXEC executor parameter mechanism, rather
than the ad-hoc kluge of passing the outer tuple down through ExecReScan.
The previous method was hard to understand and could never be extended to
handle parameters coming from multiple join levels.  This patch doesn't
change the set of possible plans nor have any significant performance effect,
but it's necessary infrastructure for future generalization of the concept
of an inner indexscan plan.

ExecReScan's second parameter is now unused, so it's removed.
2010-07-12 17:01:06 +00:00
Robert Haas f4122a8d50 Add a hook in ExecCheckRTPerms().
This hook allows a loadable module to gain control when table permissions
are checked.  It is expected to be used by an eventual SE-PostgreSQL
implementation, but there are other possible applications as well.  A
sample contrib module can be found in the archives at:

http://archives.postgresql.org/pgsql-hackers/2010-05/msg01095.php

Robert Haas and Stephen Frost
2010-07-09 14:06:01 +00:00
Heikki Linnakangas 9b8a73326e Introduce wal_level GUC to explicitly control if information needed for
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.

Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_mode setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment, they used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.

Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
2010-04-28 16:10:43 +00:00
Bruce Momjian 65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Tom Lane 05d8a561ff Clean up handling of XactReadOnly and RecoveryInProgress checks.
Add some checks that seem logically necessary, in particular let's make
real sure that HS slave sessions cannot create temp tables.  (If they did
they would think that temp tables belonging to the master's session with
the same BackendId were theirs.  We *must* not allow myTempNamespace to
become set in a slave session.)

Change setval() and nextval() so that they are only allowed on temp sequences
in a read-only transaction.  This seems consistent with what we allow for
table modifications in read-only transactions.  Since an HS slave can't have a
temp sequence, this also provides a nicer cure for the setval PANIC reported
by Erik Rijkers.

Make the error messages more uniform, and have them mention the specific
command being complained of.  This seems worth the trifling amount of extra
code, since people are likely to see such messages a lot more than before.
2010-02-20 21:24:02 +00:00
Tom Lane cbe9d6beb4 Fix up rickety handling of relation-truncation interlocks.
Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr
relation entries, so that they will get reset to InvalidBlockNumber whenever
an smgr-level flush happens.  Because we now send smgr invalidation messages
immediately (not at end of transaction) when a relation truncation occurs,
this ensures that other backends will reset their values before they next
access the relation.  We no longer need the unreliable assumption that a
VACUUM that's doing a truncation will hold its AccessExclusive lock until
commit --- in fact, we can intentionally release that lock as soon as we've
completed the truncation.  This patch therefore reverts (most of) Alvaro's
patch of 2009-11-10, as well as my marginal hacking on it yesterday.  We can
also get rid of assorted no-longer-needed relcache flushes, which are far more
expensive than an smgr flush because they kill a lot more state.

In passing this patch fixes smgr_redo's failure to perform visibility-map
truncation, and cleans up some rather dubious assumptions in freespace.c and
visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of
date.
2010-02-09 21:43:30 +00:00
Tom Lane b9b8831ad6 Create a "relation mapping" infrastructure to support changing the relfilenodes
of shared or nailed system catalogs.  This has two key benefits:

* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.

* We no longer have to use an unsafe reindex-in-place approach for reindexing
  shared catalogs.

CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.

Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.

This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch.  As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.
2010-02-07 20:48:13 +00:00
Heikki Linnakangas 9de778b24b Move the responsibility of writing a "unlogged WAL operation" record from
heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported, patch by Fujii Masao.
2010-02-03 10:01:30 +00:00
Peter Eisentraut e7b3349a8a Type table feature
This adds the CREATE TABLE name OF type command, per SQL standard.
2010-01-28 23:21:13 +00:00
Heikki Linnakangas 40f908bdcd Introduce Streaming Replication.
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me
2010-01-15 09:19:10 +00:00
Tom Lane 217dc525c0 Fix oversight in EvalPlanQualFetch: after failing to lock a tuple because
someone else has just updated it, we have to set priorXmax to that tuple's
xmax (ie, the XID of the other xact that updated it) before looping back to
examine the next tuple.  Obviously, the next tuple in the update chain should
have that XID as its xmin, not the same xmin as the preceding tuple that we
had been trying to lock.  The mismatch would cause the EvalPlanQual logic to
decide that the tuple chain ended in a deletion, when actually there was a
live tuple that should have been found.

I inserted this error when recently adding logic to EvalPlanQual to make it
lock tuples before returning them (as opposed to the old method in which the
lock would occur much later, causing a great deal of work to be wasted if we
only then discover someone else updated it).  Sigh.  Per today's report from
Takahiro Itagaki of inconsistent results during pgbench runs.
2010-01-08 02:44:00 +00:00
Bruce Momjian f98fbc78c3 Preserve relfilenodes:
Add support to pg_dump --binary-upgrade to preserve all relfilenodes,
for use by pg_migrator.
2010-01-06 03:04:03 +00:00
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Robert Haas cddca5ec13 Add an EXPLAIN (BUFFERS) option to show buffer-usage statistics.
This patch also removes buffer-usage statistics from the track_counts
output, since this (or the global server statistics) is deemed to be a better
interface to this information.

Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.
2009-12-15 04:57:48 +00:00
Tom Lane d8e511fabb Ensure that the result tuple of an EvalPlanQual cycle gets materialized
before we zap the input tuple.  Otherwise, pass-by-reference columns of
the result slot are likely to contain just references to the input
tuple, leading to big trouble if the pfree'd space is reused.  Per
trouble report from Jaime Casanova.  This is a new bug in the recent
rewrite of EvalPlanQual, so nothing to back-patch.
2009-12-11 18:14:43 +00:00
Tom Lane 62aba76568 Prevent indirect security attacks via changing session-local state within
an allegedly immutable index function.  It was previously recognized that
we had to prevent such a function from executing SET/RESET ROLE/SESSION
AUTHORIZATION, or it could trivially obtain the privileges of the session
user.  However, since there is in general no privilege checking for changes
of session-local state, it is also possible for such a function to change
settings in a way that might subvert later operations in the same session.
Examples include changing search_path to cause an unexpected function to
be called, or replacing an existing prepared statement with another one
that will execute a function of the attacker's choosing.

The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
these threats, which are the same places previously deemed to need protection
against the SET ROLE issue.  GUC changes are still allowed, since there are
many useful cases for that, but we prevent security problems by forcing a
rollback of any GUC change after completing the operation.  Other cases are
handled by throwing an error if any change is attempted; these include temp
table creation, closing a cursor, and creating or deleting a prepared
statement.  (In 7.4, the infrastructure to roll back GUC changes doesn't
exist, so we settle for rejecting changes of "search_path" in these contexts.)

Original report and patch by Gurjeet Singh, additional analysis by
Tom Lane.

Security: CVE-2009-4136
2009-12-09 21:57:51 +00:00
Tom Lane 7fc0f06221 Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be
checked to determine whether the trigger should be fired.

For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
triggers it can provide a noticeable performance improvement, since queuing of
a deferred trigger event and re-fetching of the row(s) at end of statement can
be short-circuited if the trigger does not need to be fired.

Takahiro Itagaki, reviewed by KaiGai Kohei.
2009-11-20 20:38:12 +00:00
Tom Lane 9f2ee8f287 Re-implement EvalPlanQual processing to improve its performance and eliminate
a lot of strange behaviors that occurred in join cases.  We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row.  The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.

Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested.  To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param.  Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.

This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
duplicate-output-tuple problem.  It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
2009-10-26 02:26:45 +00:00
Tom Lane 0adaf4cb31 Move the handling of SELECT FOR UPDATE locking and rechecking out of
execMain.c and into a new plan node type LockRows.  Like the recent change
to put table updating into a ModifyTable plan node, this increases planning
flexibility by allowing the operations to occur below the top level of the
plan tree.  It's necessary in any case to restore the previous behavior of
having FOR UPDATE locking occur before ModifyTable does.

This partially refactors EvalPlanQual to allow multiple rows-under-test
to be inserted into the EPQ machinery before starting an EPQ test query.
That isn't sufficient to fix EPQ's general bogosity in the face of plans
that return multiple rows per test row, though.  Since this patch is
mostly about getting some plan node infrastructure in place and not about
fixing ten-year-old bugs, I will leave EPQ improvements for another day.

Another behavioral change that we could now think about is doing FOR UPDATE
before LIMIT, but that too seems like it should be treated as a followon
patch.
2009-10-12 18:10:51 +00:00
Tom Lane 8a5849b7ff Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c.
They are now handled by a new plan node type called ModifyTable, which is
placed at the top of the plan tree.  In itself this change doesn't do much,
except perhaps make the handling of RETURNING lists and inherited UPDATEs a
tad less klugy.  But it is necessary preparation for the intended extension of
allowing RETURNING queries inside WITH.

Marko Tiikkaja
2009-10-10 01:43:50 +00:00
Tom Lane c970292a94 Remove very ancient tuple-counting infrastructure (IncrRetrieved() and
friends).  This code has all been ifdef'd out for many years, and doesn't
seem to have any prospect of becoming any more useful in the future.
EXPLAIN ANALYZE is what people use in practice, and I think if we did want
process-wide counters we'd be more likely to put in dtrace events for that
than try to resurrect this code.  Get rid of it so as to have one less detail
to worry about while refactoring execMain.c.
2009-10-08 22:34:57 +00:00
Tom Lane 249724cb01 Create an ALTER DEFAULT PRIVILEGES command, which allows users to adjust
the privileges that will be applied to subsequently-created objects.

Such adjustments are always per owning role, and can be restricted to objects
created in particular schemas too.  A notable benefit is that users can
override the traditional default privilege settings, eg, the PUBLIC EXECUTE
privilege traditionally granted by default for functions.

Petr Jelinek
2009-10-05 19:24:49 +00:00
Tom Lane f92e8a4b5e Replace the array-style TupleTable data structure with a simple List of
TupleTableSlot nodes.  This eliminates the need to count in advance
how many Slots will be needed, which seems more than worth the small
increase in the amount of palloc traffic during executor startup.

The ExecCountSlots infrastructure is now all dead code, but I'll remove it
in a separate commit for clarity.

Per a comment from Robert Haas.
2009-09-27 20:09:58 +00:00
Tom Lane 4985635230 Extend the BKI infrastructure to allow system catalogs to be given
hand-assigned rowtype OIDs, even when they are not "bootstrapped" catalogs
that have handmade type rows in pg_type.h.  Give pg_database such an OID.
Restore the availability of C macros for the rowtype OIDs of the bootstrapped
catalogs.  (These macros are now in the individual catalogs' .h files,
though, not in pg_type.h.)

This commit doesn't do anything especially useful by itself, but it's
necessary infrastructure for reverting some ill-considered changes in
relcache.c.
2009-09-26 22:42:03 +00:00
Tom Lane 25d9bf2e3e Support deferrable uniqueness constraints.
The current implementation fires an AFTER ROW trigger for each tuple that
looks like it might be non-unique according to the index contents at the
time of insertion.  This works well as long as there aren't many conflicts,
but won't scale to massive unique-key reassignments.  Improving that case
is a TODO item.

Dean Rasheed
2009-07-29 20:56:21 +00:00
Tom Lane 44aa60fa7c Revisit AlterTableCreateToastTable's API once again, hoping to make it what
pg_migrator actually needs and not just a partial solution.  We have to be
able to specify the OID that the new toast table should be created with.
2009-06-11 20:46:11 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Tom Lane 1e06ed1abe Add an option to AlterTableCreateToastTable() to allow its caller to force
a toast table to be built, even if the sum-of-column-widths calculation
indicates one isn't needed.  This is needed by pg_migrator because if the
old table has a toast table, we have to migrate over the toast table since
it might contain some live data, even though subsequent column drops could
mean that no recently-added rows could require toasting.
2009-05-07 22:58:28 +00:00
Tom Lane 3d02cae310 Ensure that INSERT ... SELECT into a table with OIDs never copies row OIDs
from the source table.  This could never happen anyway before 8.4 because
the executor invariably applied a "junk filter" to rows due to be inserted;
but now that we skip doing that when it's not necessary, the case can occur.
Problem noted 2008-11-27 by KaiGai Kohei, though I misunderstood what he
was on about at the time (the opacity of the patch he proposed didn't help).
2009-02-08 18:02:27 +00:00
Alvaro Herrera 3a5b773715 Allow reloption names to have qualifiers, initially supporting a TOAST
qualifier, and add support for this in pg_dump.

This allows TOAST tables to have user-defined fillfactor, and will also
enable us to move the autovacuum parameters to reloptions without taking
away the possibility of setting values for TOAST tables.
2009-02-02 19:31:40 +00:00
Tom Lane 3cb5d6580a Support column-level privileges, as required by SQL standard.
Stephen Frost, with help from KaiGai Kohei and others
2009-01-22 20:16:10 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Tom Lane c1f3073333 Clean up the API for DestReceiver objects by eliminating the assumption
that a Portal is a useful and sufficient additional argument for
CreateDestReceiver --- it just isn't, in most cases.  Instead formalize
the approach of passing any needed parameters to the receiver separately.

One unexpected benefit of this change is that we can declare typedef Portal
in a less surprising location.

This patch is just code rearrangement and doesn't change any functionality.
I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.
2008-11-30 20:51:25 +00:00
Tom Lane cd35e9d746 Some infrastructure changes for the upcoming auto-explain contrib module:
* Refactor explain.c slightly to export a convenient-to-use subroutine
for printing EXPLAIN results.

* Provide hooks for plugins to get control at ExecutorStart and ExecutorEnd
as well as ExecutorRun.

* Add some minimal support for tracking the total runtime of ExecutorRun.
This code won't actually do anything unless a plugin prods it to.

* Change the API of the DefineCustomXXXVariable functions to allow nonzero
"flags" to be specified for a custom GUC variable.  While at it, also make
the "bootstrap" default value for custom GUCs be explicitly specified as a
parameter to these functions.  This is to eliminate confusion over where the
default comes from, as has been expressed in the past by some users of the
custom-variable facility.

* Refactor GUC code a bit to ensure that a custom variable gets initialized to
something valid (like its default value) even if the placeholder value was
invalid.
2008-11-19 01:10:24 +00:00
Tom Lane 18004101ac Modify UPDATE/DELETE WHERE CURRENT OF to use the FOR UPDATE infrastructure to
locate the target row, if the cursor was declared with FOR UPDATE or FOR
SHARE.  This approach is more flexible and reliable than digging through the
plan tree; for instance it can cope with join cursors.  But we still provide
the old code for use with non-FOR-UPDATE cursors.  Per gripe from Robert Haas.
2008-11-16 17:34:28 +00:00
Tom Lane 0656ed3daa Make SELECT FOR UPDATE/SHARE work on inheritance trees, by having the plan
return the tableoid as well as the ctid for any FOR UPDATE targets that
have child tables.  All child tables are listed in the ExecRowMark list,
but the executor just skips the ones that didn't produce the current row.

Curiously, this longstanding restriction doesn't seem to have been documented
anywhere; so no doc changes.
2008-11-15 19:43:47 +00:00
Tom Lane 85e2cedf98 Improve bulk-insert performance by keeping the current target buffer pinned
(but not locked, as that would risk deadlocks).  Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.

Robert Haas, after an idea of Simon Riggs'.
2008-11-06 20:51:15 +00:00
Tom Lane df5a99612d Simplify ExecutorRun's API and save some trivial number of cycles by having
it just return void instead of sometimes returning a TupleTableSlot.  SQL
functions don't need that anymore, and noplace else does either.  Eliminating
the return value also means one less hassle for the ExecutorRun hook functions
that will be supported beginning in 8.4.
2008-10-31 21:07:55 +00:00
Tom Lane e5536e77a5 Move exprType(), exprTypmod(), expression_tree_walker(), and related routines
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend.  There's probably more that should be done along this line,
but this is a start anyway.
2008-08-25 22:42:34 +00:00
Tom Lane 30fd8ec799 Install checks in executor startup to ensure that the tuples produced by an
INSERT or UPDATE will match the target table's current rowtype.  In pre-8.3
releases inconsistency can arise with stale cached plans, as reported by
Merlin Moncure.  (We patched the equivalent hazard on the SELECT side in Feb
2007; I'm not sure why we thought there was no risk on the insertion side.)
In 8.3 and HEAD this problem should be impossible due to plan cache
invalidation management, but it seems prudent to make the check anyway.

Back-patch as far as 8.0.  7.x versions lack ALTER COLUMN TYPE, so there
seems no way to abuse a stale plan comparably.
2008-08-08 17:01:11 +00:00
Tom Lane a77eaa6a95 As noted by Andrew Gierth, there's really no need any more to force a junk
filter to be used when INSERT or SELECT INTO has a plan that returns raw
disk tuples.  The virtual-tuple-slot optimizations that were put in place
awhile ago mean that ExecInsert has to do ExecMaterializeSlot, and that
already copies the tuple if it's raw (and does so more efficiently than
a junk filter, too).  So get rid of that logic.  This in turn means that
we can throw away ExecMayReturnRawTuples, which wasn't used for any other
purpose, and was always a kluge anyway.

In passing, move a couple of SELECT-INTO-specific fields out of EState
and into the private state of the SELECT INTO DestReceiver, as was foreseen
in an old comment there.  Also make intorel_receive use ExecMaterializeSlot
not ExecCopySlotTuple, for consistency with ExecInsert and to possibly save
a tuple copy step in some cases.
2008-07-26 19:15:35 +00:00
Tom Lane 6cc88f0af5 Provide a function hook to let plug-ins get control around ExecutorRun.
ITAGAKI Takahiro
2008-07-18 18:23:47 +00:00
Alvaro Herrera 5da9da71c4 Improve snapshot manager by keeping explicit track of snapshots.
There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment.  This is all done automatically now.

As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.
2008-05-12 20:02:02 +00:00
Alvaro Herrera f8c4d7db60 Restructure some header files a bit, in particular heapam.h, by removing some
unnecessary #include lines in it.  Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.

For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.

While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
2008-05-12 00:00:54 +00:00
Tom Lane cd902b331d Change the rules for inherited CHECK constraints to be essentially the same
as those for inherited columns; that is, it's no longer allowed for a child
table to not have a check constraint matching one that exists on a parent.
This satisfies the principle of least surprise (rows selected from the parent
will always appear to meet its check constraints) and eliminates some
longstanding bogosity in pg_dump, which formerly had to guess about whether
check constraints were really inherited or not.

The implementation involves adding conislocal and coninhcount columns to
pg_constraint (paralleling attislocal and attinhcount in pg_attribute)
and refactoring various ALTER TABLE actions to be more like those for
columns.

Alex Hunsaker, Nikhil Sontakke, Tom Lane
2008-05-09 23:32:05 +00:00
Tom Lane f593f62336 Fix a couple of places in execMain that erroneously assumed that SELECT FOR
UPDATE/SHARE couldn't occur as a subquery in a query with a non-SELECT
top-level operation.  Symptoms included outright failure (as in report from
Mark Mielke) and silently neglecting to take the requested row locks.

Back-patch to 8.3, because the visible failure in the INSERT ... SELECT case
is a regression from 8.2.  I'm a bit hesitant to back-patch further given the
lack of field complaints.
2008-04-21 03:49:45 +00:00
Tom Lane 7692d8d5b7 Support statement-level ON TRUNCATE triggers. Simon Riggs 2008-03-28 00:21:56 +00:00
Alvaro Herrera 73b0300b2a Move the HTSU_Result enum definition into snapshot.h, to avoid including
tqual.h into heapam.h.  This makes all inclusion of tqual.h explicit.

I also sorted alphabetically the includes on some source files.
2008-03-26 21:10:39 +00:00
Tom Lane b7fe5f70d3 Fix CREATE TABLE ... LIKE ... INCLUDING INDEXES to not cause unwanted
tablespace permissions failures when copying an index that is in the
database's default tablespace.  A side-effect of the change is that explicitly
specifying the default tablespace no longer triggers a permissions check;
this is not how it was done in pre-8.3 releases but is argued to be more
consistent.  Per bug #3921 from Andrew Gilligan.  (Note: I argued in the
subsequent discussion that maybe LIKE shouldn't copy index tablespaces
at all, but since no one indicated agreement with that idea, I've refrained
from doing it.)
2008-02-07 17:09:51 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Tom Lane 895a94de6d Avoid incrementing the CommandCounter when CommandCounterIncrement is called
but no database changes have been made since the last CommandCounterIncrement.
This should result in a significant improvement in the number of "commands"
that can typically be performed within a transaction before hitting the 2^32
CommandId size limit.  In particular this buys back (and more) the possible
adverse consequences of my previous patch to fix plan caching behavior.

The implementation requires tracking whether the current CommandCounter
value has been "used" to mark any tuples.  CommandCounter values stored into
snapshots are presumed not to be used for this purpose.  This requires some
small executor changes, since the executor used to conflate the curcid of
the snapshot it was using with the command ID to mark output tuples with.
Separating these concepts allows some small simplifications in executor APIs.

Something for the TODO list: look into having CommandCounterIncrement not do
AcceptInvalidationMessages.  It seems fairly bogus to be doing it there,
but exactly where to do it instead isn't clear, and I'm disinclined to mess
with asynchronous behavior during late beta.
2007-11-30 21:22:54 +00:00
Bruce Momjian f6e8730d11 Re-run pgindent with updated list of typedefs. (Updated README should
avoid this problem in the future.)
2007-11-15 22:25:18 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane 282d2a03dd HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version.  Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.

In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.

Pavan Deolasee, with help from a bunch of other people.
2007-09-20 17:56:33 +00:00
Tom Lane 0a51e7073c Don't take ProcArrayLock while exiting a transaction that has no XID; there is
no need for serialization against snapshot-taking because the xact doesn't
affect anyone else's snapshot anyway.  Per discussion.  Also, move various
info about the interlocking of transactions and snapshots out of code comments
and into a hopefully-more-cohesive discussion in access/transam/README.

Also, remove a couple of now-obsolete comments about having to force some WAL
to be written to persuade RecordTransactionCommit to do its thing.
2007-09-07 20:59:26 +00:00
Tom Lane 817946bb04 Arrange to cache a ResultRelInfo in the executor's EState for relations that
are not one of the query's defined result relations, but nonetheless have
triggers fired against them while the query is active.  This was formerly
impossible but can now occur because of my recent patch to fix the firing
order for RI triggers.  Caching a ResultRelInfo avoids duplicating work by
repeatedly opening and closing the same relation, and also allows EXPLAIN
ANALYZE to "see" and report on these extra triggers.  Use the same mechanism
to cache open relations when firing deferred triggers at transaction shutdown;
this replaces the former one-element-cache strategy used in that case, and
should improve performance a bit when there are deferred triggers on a number
of relations.
2007-08-15 21:39:50 +00:00
Tom Lane 6808f1b1de Support UPDATE/DELETE WHERE CURRENT OF cursor_name, per SQL standard.
Along the way, allow FOR UPDATE in non-WITH-HOLD cursors; there may once
have been a reason to disallow that, but it seems to work now, and it's
really rather necessary if you want to select a row via a cursor and then
update it in a concurrent-safe fashion.

Original patch by Arul Shaji, rather heavily editorialized by Tom Lane.
2007-06-11 01:16:30 +00:00
Tom Lane acfce502ba Create a GUC parameter temp_tablespaces that allows selection of the
tablespace(s) in which to store temp tables and temporary files.  This is a
list to allow spreading the load across multiple tablespaces (a random list
element is chosen each time a temp object is to be created).  Temp files are
not stored in per-database pgsql_tmp/ directories anymore, but per-tablespace
directories.

Jaime Casanova and Albert Cervera, with review by Bernd Helmle and Tom Lane.
2007-06-03 17:08:34 +00:00
Tom Lane bbbe825f5f Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve the
types of unspecified parameters when submitted via extended query protocol.
This worked in 8.2 but I had broken it during plancache changes.  DECLARE
CURSOR is now treated almost exactly like a plain SELECT through parse
analysis, rewrite, and planning; only just before sending to the executor
do we divert it away to ProcessUtility.  This requires a special-case check
in a number of places, but practically all of them were already special-casing
SELECT INTO, so it's not too ugly.  (Maybe it would be a good idea to merge
the two by treating IntoClause as a form of utility statement?  Not going to
worry about that now, though.)  That approach doesn't work for EXPLAIN,
however, so for that I punted and used a klugy solution of running parse
analysis an extra time if under extended query protocol.
2007-04-27 22:05:49 +00:00
Tom Lane fba8113c1b Teach CLUSTER to skip writing WAL if not needed (ie, not using archiving)
--- Simon.
Also, code review and cleanup for the previous COPY-no-WAL patches --- Tom.
2007-03-29 00:15:39 +00:00
Tom Lane e85a01df67 Clean up the representation of special snapshots by including a "method
pointer" in every Snapshot struct.  This allows removal of the case-by-case
tests in HeapTupleSatisfiesVisibility, which should make it a bit faster
(I didn't try any performance tests though).  More importantly, we are no
longer violating portable C practices by assuming that small integers are
distinct from all pointer values, and HeapTupleSatisfiesDirty no longer
has a non-reentrant API involving side-effects on a global variable.

There were a couple of places calling HeapTupleSatisfiesXXX routines
directly rather than through the HeapTupleSatisfiesVisibility macro.
Since these places had to be changed anyway, I chose to make them go
through the macro for uniformity.

Along the way I renamed HeapTupleSatisfiesSnapshot to HeapTupleSatisfiesMVCC
to emphasize that it's only used with MVCC-type snapshots.  I was sorely
tempted to rename HeapTupleSatisfiesVisibility to HeapTupleSatisfiesSnapshot,
but forebore for the moment to avoid confusion and reduce the likelihood that
this patch breaks some of the pending patches.  Might want to reconsider
doing that later.
2007-03-25 19:45:14 +00:00
Bruce Momjian a535cdf130 Revert temp_tablespaces because of coding problems, per Tom. 2007-03-06 02:06:15 +00:00
Tom Lane c7ff7663e4 Get rid of the separate EState for subplans, and just let them share the
parent query's EState.  Now that there's a single flat rangetable for both
the main plan and subplans, there's no need anymore for a separate EState,
and removing it allows cleaning up some crufty code in nodeSubplan.c and
nodeSubqueryscan.c.  Should be a tad faster too, although any difference
will probably be hard to measure.  This is the last bit of subsidiary
mop-up work from changing to a flat rangetable.
2007-02-27 01:11:26 +00:00
Tom Lane eab6b8b27e Turn the rangetable used by the executor into a flat list, and avoid storing
useless substructure for its RangeTblEntry nodes.  (I chose to keep using the
same struct node type and just zero out the link fields for unneeded info,
rather than making a separate ExecRangeTblEntry type --- it seemed too
fragile to have two different rangetable representations.)

Along the way, put subplans into a list in the toplevel PlannedStmt node,
and have SubPlan nodes refer to them by list index instead of direct pointers.
Vadim wanted to do that years ago, but I never understood what he was on about
until now.  It makes things a *whole* lot more robust, because we can stop
worrying about duplicate processing of subplans during expression tree
traversals.  That's been a constant source of bugs, and it's finally gone.

There are some consequent simplifications yet to be made, like not using
a separate EState for subplans in the executor, but I'll tackle that later.
2007-02-22 22:00:26 +00:00
Tom Lane 9cbd0c155d Remove the Query structure from the executor's API. This allows us to stop
storing mostly-redundant Query trees in prepared statements, portals, etc.
To replace Query, a new node type called PlannedStmt is inserted by the
planner at the top of a completed plan tree; this carries just the fields of
Query that are still needed at runtime.  The statement lists kept in portals
etc. now consist of intermixed PlannedStmt and bare utility-statement nodes
--- no Query.  This incidentally allows us to remove some fields from Query
and Plan nodes that shouldn't have been there in the first place.

Still to do: simplify the execution-time range table; at the moment the
range table passed to the executor still contains Query trees for subqueries.

initdb forced due to change of stored rules.
2007-02-20 17:32:18 +00:00
Tom Lane 5413eef8dc Repair failure to check that a table is still compatible with a previously
made query plan.  Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has.  Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor.  Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query.  However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend.  Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.

This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see.  Our thanks to Jeff Trout for the initial report.

Security: CVE-2007-0556
2007-02-02 00:07:03 +00:00
Bruce Momjian 148ea5cbea Add GUC temp_tablespaces to provide a default location for temporary
objects.

Jaime Casanova
2007-01-25 04:35:11 +00:00
Bruce Momjian ef65f6f7a4 Prevent WAL logging when COPY is done in the same transation that
created it.

Simon Riggs
2007-01-25 02:17:26 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane 0cbc5b1ed4 Fix failure due to accessing an already-freed tuple descriptor in a plan
involving HashAggregate over SubqueryScan (this is the known case, there
may well be more).  The bug is only latent in releases before 8.2 since they
didn't try to access tupletable slots' descriptors during ExecDropTupleTable.
The least bogus fix seems to be to make subqueries share the parent query's
memory context, so that tupdescs they create will have the same lifespan as
those of the parent query.  There are comments in the code envisioning going
even further by not having a separate child EState at all, but that will
require rethinking executor access to range tables, which I don't want to
tackle right now.  Per bug report from Jean-Pierre Pelletier.
2006-12-26 21:37:20 +00:00
Tom Lane 8dcc8e3761 Refactor ExecGetJunkAttribute to avoid searching for junk attributes
by name on each and every row processed.  Profiling suggests this may
buy a percent or two for simple UPDATE scenarios, which isn't huge,
but when it's so easy to get ...
2006-12-04 02:06:55 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane 3f8db37c2f Tweak SPI_cursor_open to allow INSERT/UPDATE/DELETE RETURNING; this was
merely a matter of fixing the error check, since the underlying Portal
infrastructure already handles it.  This in turn allows these statements
to be used in some existing plpgsql and plperl contexts, such as a
plpgsql FOR loop.  Also, do some marginal code cleanup in places that
were being sloppy about distinguishing SELECT from SELECT INTO.
2006-08-12 20:05:56 +00:00
Tom Lane 7a3e30e608 Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests.
plpgsql support to come later.  Along the way, convert execMain's
SELECT INTO support into a DestReceiver, in order to eliminate some ugly
special cases.

Jonah Harris and Tom Lane
2006-08-12 02:52:06 +00:00