postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	dbca84f04e	In rebuild_relation(), don't access an already-closed relcache entry. This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by Andrew Dunstan, and could sometimes fail in normal operation, resulting in a wrong persistence value being used for the transient table. It's not immediately clear to me what effects that might have beyond the risk of a crash while accessing OldHeap->rd_rel->relpersistence, but it's probably not good. Bug introduced by commit `f41872d0c`, and made substantially worse by commit `85b506bbf`, which added a second such access significantly later than the heap_close. I doubt the first reference could fail in a production scenario, but the second one definitely could. Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com	2017-03-04 16:09:33 -05:00
Tom Lane	aedd554f84	Fix CatalogTupleInsert/Update abstraction for case of shared indstate. Add CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo to let callers use the CatalogTupleXXX abstraction layer even in cases where we want to share the results of CatalogOpenIndexes across multiple inserts/updates for efficiency. This finishes the job begun in commit `2f5c9d9c9`, by allowing some remaining simple_heap_insert/update calls to be replaced. The abstraction layer is now complete enough that we don't have to export CatalogIndexInsert at all anymore. Also, this fixes several places in which `2f5c9d9c9` introduced performance regressions by using retail CatalogTupleInsert or CatalogTupleUpdate even though the previous coding had been able to amortize CatalogOpenIndexes work across multiple tuples. A possible future improvement is to arrange for the indexing.c functions to cache the CatalogIndexState somewhere, maybe in the relcache, in which case we could get rid of CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo again. But that's a task for another day. Discussion: https://postgr.es/m/27502.1485981379@sss.pgh.pa.us	2017-02-01 17:18:36 -05:00
Alvaro Herrera	2f5c9d9c9c	Tweak catalog indexing abstraction for upcoming WARM Split the existing CatalogUpdateIndexes into two different routines, CatalogTupleInsert and CatalogTupleUpdate, which do both the heap insert/update plus the index update. This removes over 300 lines of boilerplate code all over src/backend/catalog/ and src/backend/commands. The resulting code is much more pleasing to the eye. Also, by encapsulating what happens in detail during an UPDATE, this facilitates the upcoming WARM patch, which is going to add a few more lines to the update case making the boilerplate even more boring. The original CatalogUpdateIndexes is removed; there was only one use left, and since it's just three lines, we can as well expand it in place there. We could keep it, but WARM is going to break all the UPDATE out-of-core callsites anyway, so there seems to be no benefit in doing so. Author: Pavan Deolasee Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com	2017-01-31 18:42:24 -03:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Robert Haas	3856cf9607	Remove should_free arguments to tuplesort routines. Since commit `e94568ecc1`, the answer is always "false", and we do not need to complicate the API by arranging to return a constant value. Peter Geoghegan Discussion: http://postgr.es/m/CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com	2016-12-12 15:57:35 -05:00
Tom Lane	ea268cdc9a	Add macros to make AllocSetContextCreate() calls simpler and safer. I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>	2016-08-27 17:50:38 -04:00
Greg Stark	e1623c3959	Fix various common mispellings. Mostly these are just comments but there are a few in documentation and a handful in code and tests. Hopefully this doesn't cause too much unnecessary pain for backpatching. I relented from some of the most common like "thru" for that reason. The rest don't seem numerous enough to cause problems. Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings	2016-06-03 16:08:45 +01:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Fujii Masao	ecd222e770	Support VERBOSE option in REINDEX command. When this option is specified, a progress report is printed as each index is reindexed. Per discussion, we agreed on the following syntax for the extensibility of the options. REINDEX (flexible options) { INDEX \| ... } name Sawada Masahiko. Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi, Jim Nasby and me. Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com	2015-05-15 20:09:57 +09:00
Alvaro Herrera	a2e35b53c3	Change many routines to return ObjectAddress rather than OID The changed routines are mostly those that can be directly called by ProcessUtilitySlow; the intention is to make the affected object information more precise, in support for future event trigger changes. Originally it was envisioned that the OID of the affected object would be enough, and in most cases that is correct, but upon actually implementing the event trigger changes it turned out that ObjectAddress is more widely useful. Additionally, some command execution routines grew an output argument that's an object address which provides further info about the executed command. To wit: * for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of the new constraint * for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the schema that originally contained the object. * for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address of the object added to or dropped from the extension. There's no user-visible change in this commit, and no functional change either. Discussion: 20150218213255.GC6717@tamriel.snowman.net Reviewed-By: Stephen Frost, Andres Freund	2015-03-03 14:10:50 -03:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Alvaro Herrera	85b506bbfc	Get rid of SET LOGGED indexes persistence kludge This removes ATChangeIndexesPersistence() introduced by `f41872d0c1` which was too ugly to live for long. Instead, the correct persistence marking is passed all the way down to reindex_index, so that the transient relation built to contain the index relfilenode can get marked correctly right from the start. Author: Fabrízio de Royes Mello Review and editorialization by Michael Paquier and Álvaro Herrera	2014-11-15 01:19:49 -03:00
Heikki Linnakangas	2076db2aea	Move the backup-block logic from XLogInsert to a new file, xloginsert.c. xlog.c is huge, this makes it a little bit smaller, which is nice. Functions related to putting together the WAL record are in xloginsert.c, and the lower level stuff for managing WAL buffers and such are in xlog.c. Also move the definition of XLogRecord to a separate header file. This causes churn in the #includes of all the files that write WAL records, and redo routines, but it avoids pulling in xlog.h into most places. Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.	2014-11-06 13:55:36 +02:00
Alvaro Herrera	f41872d0c1	Implement ALTER TABLE .. SET LOGGED / UNLOGGED This enables changing permanent (logged) tables to unlogged and vice-versa. (Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that hopefully makes more sense than the original.) Author: Fabrízio de Royes Mello Reviewed by: Christoph Berg, Andres Freund, Thom Brown Some tweaking by Álvaro Herrera	2014-08-22 14:27:00 -04:00
Alvaro Herrera	b7e51d9c06	Don't allow relminmxid to go backwards during VACUUM FULL We were allowing a table's pg_class.relminmxid value to move backwards when heaps were swapped by VACUUM FULL or CLUSTER. There is a similar protection against relfrozenxid going backwards, which we neglected to clone when the multixact stuff was rejiggered by commit `0ac5ad5134`. Backpatch to 9.3, where relminmxid was introduced. As reported by Heikki in http://www.postgresql.org/message-id/52401AEA.9000608@vmware.com	2014-06-27 14:43:46 -04:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Simon Riggs	e5550d5fec	Reduce lock levels of some ALTER TABLE cmds VALIDATE CONSTRAINT CLUSTER ON SET WITHOUT CLUSTER ALTER COLUMN SET STATISTICS ALTER COLUMN SET () ALTER COLUMN RESET () All other sub-commands use AccessExclusiveLock Simon Riggs and Noah Misch Reviews by Robert Haas and Andres Freund	2014-04-06 11:13:43 -04:00
Robert Haas	af2543e884	Allow VACUUM FULL/CLUSTER to bump freeze horizons even for pg_class. pg_class is a special case for CLUSTER and VACUUM FULL, so although commit `3cff1879f8` caused these operations to advance relfrozenxid and relminmxid for all other tables, it did not provide the same benefit for pg_class. This plugs that gap. Andres Freund	2014-03-04 11:08:18 -05:00
Robert Haas	b89e151054	Introduce logical decoding. This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the effected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.	2014-03-03 16:32:18 -05:00
Alvaro Herrera	801c2dc72c	Separate multixact freezing parameters from xid's Previously we were piggybacking on transaction ID parameters to freeze multixacts; but since there isn't necessarily any relationship between rates of Xid and multixact consumption, this turns out not to be a good idea. Therefore, we now have multixact-specific freezing parameters: vacuum_multixact_freeze_min_age: when to remove multis as we come across them in vacuum (default to 5 million, i.e. early in comparison to Xid's default of 50 million) vacuum_multixact_freeze_table_age: when to force whole-table scans instead of scanning only the pages marked as not all visible in visibility map (default to 150 million, same as for Xids). Whichever of both which reaches the 150 million mark earlier will cause a whole-table scan. autovacuum_multixact_freeze_max_age: when for cause emergency, uninterruptible whole-table scans (default to 400 million, double as that for Xids). This means there shouldn't be more frequent emergency vacuuming than previously, unless multixacts are being used very rapidly. Backpatch to 9.3 where multixacts were made to persist enough to require freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of fields in an unnatural place, and StdRdOptions is split in two so that the newly added fields can go at the end. Patch by me, reviewed by Robert Haas, with additional input from Andres Freund and Tom Lane.	2014-02-13 19:36:31 -03:00
Fujii Masao	5363c7f2bc	Fix typo in comment. Sawada Masahiko	2014-01-21 02:24:17 +09:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Robert Haas	3cff1879f8	Aggressively freeze tables when CLUSTER or VACUUM FULL rewrites them. We haven't wanted to do this in the past on the grounds that in rare cases the original xmin value will be needed for forensic purposes, but commit `37484ad2aa` removes that objection, so now we can. Per extensive discussion, among many people, on pgsql-hackers.	2014-01-02 15:15:51 -05:00
Alvaro Herrera	f54106f77e	Fix full-table-vacuum request mechanism for MultiXactIds While autovacuum dutifully launched anti-multixact-wraparound vacuums when the multixact "age" was reached, the vacuum code was not aware that it needed to make them be full table vacuums. As the resulting partial-table vacuums aren't capable of actually increasing relminmxid, autovacuum continued to launch anti-wraparound vacuums that didn't have the intended effect, until age of relfrozenxid caused the vacuum to finally be a full table one via vacuum_freeze_table_age. To fix, introduce logic for multixacts similar to that for plain TransactionIds, using the same GUCs. Backpatch to 9.3, where permanent MultiXactIds were introduced. Andres Freund, some cleanup by Álvaro	2013-11-29 21:47:13 -03:00
Robert Haas	8e18d04d4d	Refine our definition of what constitutes a system relation. Although user-defined relations can't be directly created in pg_catalog, it's possible for them to end up there, because you can create them in some other schema and then use ALTER TABLE .. SET SCHEMA to move them there. Previously, such relations couldn't afterwards be manipulated, because IsSystemRelation()/IsSystemClass() rejected all attempts to modify objects in the pg_catalog schema, regardless of their origin. With this patch, they now reject only those objects in pg_catalog which were created at initdb-time, allowing most operations on user-created tables in pg_catalog to proceed normally. This patch also adds new functions IsCatalogRelation() and IsCatalogClass(), which is similar to IsSystemRelation() and IsSystemClass() but with a slightly narrower definition: only TOAST tables of system catalogs are included, rather than all TOAST tables. This is currently used only for making decisions about when invalidation messages need to be sent, but upcoming logical decoding patches will find other uses for this information. Andres Freund, with some modifications by me.	2013-11-28 20:57:20 -05:00
Alvaro Herrera	dd778e9d88	Rename various "freeze multixact" variables It seems to make more sense to use "cutoff multixact" terminology throughout the backend code; "freeze" is associated with replacing of an Xid with FrozenTransactionId, which is not what we do for MultiXactIds. Andres Freund Some adjustments by Álvaro Herrera	2013-09-16 15:47:31 -03:00
Robert Haas	813fb03155	Remove SnapshotNow and HeapTupleSatisfiesNow. We now use MVCC catalog scans, and, per discussion, have eliminated all other remaining uses of SnapshotNow, so that we can now get rid of it. This will break third-party code which is still using it, which is intentional, as we want such code to be updated to do things the new way.	2013-08-01 10:46:19 -04:00
Robert Haas	0518eceec3	Adjust HeapTupleSatisfies* routines to take a HeapTuple. Previously, these functions took a HeapTupleHeader, but upcoming patches for logical replication will introduce new a new snapshot type under which the tuple's TID will be used to lookup (CMIN, CMAX) for visibility determination purposes. This makes that information available. Code churn is minimal since HeapTupleSatisfiesVisibility took the HeapTuple anyway, and deferenced it before calling the satisfies function. Independently of logical replication, this allows t_tableOid and t_self to be cross-checked via assertions in tqual.c. This seems like a useful way to make sure that all callers are setting these values properly, which has been previously put forward as desirable. Andres Freund, reviewed by Álvaro Herrera	2013-07-22 13:38:44 -04:00
Kevin Grittner	cc1965a99b	Add support for REFRESH MATERIALIZED VIEW CONCURRENTLY. This allows reads to continue without any blocking while a REFRESH runs. The new data appears atomically as part of transaction commit. Review questioned the Assert that a matview was not a system relation. This will be addressed separately. Reviewed by Hitoshi Harada, Robert Haas, Andres Freund. Merged after review with security patch `f3ab5d4`.	2013-07-16 12:55:44 -05:00
Fujii Masao	2ef085d0e6	Get rid of pg_class.reltoastidxid. Treat TOAST index just the same as normal one and get the OID of TOAST index from pg_index but not pg_class.reltoastidxid. This change allows us to handle multiple TOAST indexes, and which is required infrastructure for upcoming REINDEX CONCURRENTLY feature. Patch by Michael Paquier, reviewed by Andres Freund and me.	2013-07-04 03:24:09 +09:00
Robert Haas	568d4138c6	Use an MVCC snapshot, rather than SnapshotNow, for catalog scans. SnapshotNow scans have the undesirable property that, in the face of concurrent updates, the scan can fail to see either the old or the new versions of the row. In many cases, we work around this by requiring DDL operations to hold AccessExclusiveLock on the object being modified; in some cases, the existing locking is inadequate and random failures occur as a result. This commit doesn't change anything related to locking, but will hopefully pave the way to allowing lock strength reductions in the future. The major issue has held us back from making this change in the past is that taking an MVCC snapshot is significantly more expensive than using a static special snapshot such as SnapshotNow. However, testing of various worst-case scenarios reveals that this problem is not severe except under fairly extreme workloads. To mitigate those problems, we avoid retaking the MVCC snapshot for each new scan; instead, we take a new snapshot only when invalidation messages have been processed. The catcache machinery already requires that invalidation messages be sent before releasing the related heavyweight lock; else other backends might rely on locally-cached data rather than scanning the catalog at all. Thus, making snapshot reuse dependent on the same guarantees shouldn't break anything that wasn't already subtly broken. Patch by me. Review by Michael Paquier and Andres Freund.	2013-07-02 09:47:01 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Tom Lane	1d6c72a55b	Move materialized views' is-populated status into their pg_class entries. Previously this state was represented by whether the view's disk file had zero or nonzero size, which is problematic for numerous reasons, since it's breaking a fundamental assumption about heap storage. This was done to allow unlogged matviews to revert to unpopulated status after a crash despite our lack of any ability to update catalog entries post-crash. However, this poses enough risk of future problems that it seems better to not support unlogged matviews until we can find another way. Accordingly, revert that choice as well as a number of existing kluges forced by it in favor of creating a pg_class.relispopulated flag column.	2013-05-06 13:27:22 -04:00
Kevin Grittner	52e6e33ab4	Create a distinction between a populated matview and a scannable one. The intent was that being populated would, long term, be just one of the conditions which could affect whether a matview was scannable; being populated should be necessary but not always sufficient to scan the relation. Since only CREATE and REFRESH currently determine the scannability, names and comments accidentally conflated these concepts, leading to confusion. Also add missing locking for the SQL function which allows a test for scannability, and fix a modularity violatiion. Per complaints from Tom Lane, although its not clear that these will satisfy his concerns. Hopefully this will at least better frame the discussion.	2013-04-09 13:02:49 -05:00
Robert Haas	05f3f9c7b2	Extend object-access hook machinery to support post-alter events. This also slightly widens the scope of what we support in terms of post-create events. KaiGai Kohei, with a few changes, mostly to the comments, by me	2013-03-17 22:57:26 -04:00
Kevin Grittner	3bf3ab8c56	Add a materialized view relations. A materialized view has a rule just like a view and a heap and other physical properties like a table. The rule is only used to populate the table, references in queries refer to the materialized data. This is a minimal implementation, but should still be useful in many cases. Currently data is only populated "on demand" by the CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements. It is expected that future releases will add incremental updates with various timings, and that a more refined concept of defining what is "fresh" data will be developed. At some point it may even be possible to have queries use a materialized in place of references to underlying tables, but that requires the other above-mentioned features to be working first. Much of the documentation work by Robert Haas. Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja Security review by KaiGai Kohei, with a decision on how best to implement sepgsql still pending.	2013-03-03 18:23:31 -06:00
Alvaro Herrera	0ac5ad5134	Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov	2013-01-23 12:04:59 -03:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Tom Lane	3c84046490	Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY. Commit `8cb53654db`, which introduced DROP INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor choice of catalog state representation. The pg_index state for an index that's reached the final pre-drop stage was the same as the state for an index just created by CREATE INDEX CONCURRENTLY. This meant that the (necessary) change to make RelationGetIndexList ignore about-to-die indexes also made it ignore freshly-created indexes; which is catastrophic because the latter do need to be considered in HOT-safety decisions. Failure to do so leads to incorrect index entries and subsequently wrong results from queries depending on the concurrently-created index. To fix, add an additional boolean column "indislive" to pg_index, so that the freshly-created and about-to-die states can be distinguished. (This change obviously is only possible in HEAD. This patch will need to be back-patched, but in 9.2 we'll use a kluge consisting of overloading the formerly-impossible state of indisvalid = true and indisready = false.) In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index flag changes they make without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. This is a pre-existing bug in CREATE INDEX CONCURRENTLY, which was copied into the DROP code. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. In addition, do some code and docs review for DROP INDEX CONCURRENTLY; some cosmetic code cleanup but mostly addition and revision of comments. This will need to be back-patched, but in a noticeably different form, so I'm committing it to HEAD before working on the back-patch. Problem reported by Amit Kapila, diagnosis by Pavan Deolassee, fix by Tom Lane and Andres Freund.	2012-11-28 21:26:01 -05:00
Alvaro Herrera	f4c4335a4a	Add context info to OAT_POST_CREATE security hook ... and have sepgsql use it to determine whether to check permissions during certain operations. Indexes that are being created as a result of REINDEX, for instance, do not need to have their permissions checked; they were already checked when the index was created. Author: KaiGai Kohei, slightly revised by me	2012-10-23 18:24:24 -03:00
Peter Eisentraut	b8b2e3b2de	Replace int2/int4 in C code with int16/int32 The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.	2012-06-25 01:51:46 +03:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Robert Haas	0e549697d1	Classify DROP operations by whether or not they are user-initiated. This doesn't do anything useful just yet, but is intended as supporting infrastructure for allowing sepgsql to sensibly check DROP permissions. KaiGai Kohei and Robert Haas	2012-01-26 09:30:27 -05:00
Tom Lane	21b446dd09	Fix CLUSTER/VACUUM FULL for toast values owned by recently-updated rows. In commit `7b0d0e9356`, I made CLUSTER and VACUUM FULL try to preserve toast value OIDs from the original toast table to the new one. However, if we have to copy both live and recently-dead versions of a row that has a toasted column, those versions may well reference the same toast value with the same OID. The patch then led to duplicate-key failures as we tried to insert the toast value twice with the same OID. (The previous behavior was not very desirable either, since it would have silently inserted the same value twice with different OIDs. That wastes space, but what's worse is that the toast values inserted for already-dead heap rows would not be reclaimed by subsequent ordinary VACUUMs, since they go into the new toast table marked live not deleted.) To fix, check if the copied OID already exists in the new toast table, and if so, assume that it stores the desired value. This is reasonably safe since the only case where we will copy an OID from a previous toast pointer is when toast_insert_or_update was given that toast pointer and so we just pulled the data from the old table; if we got two different values that way then we have big problems anyway. We do have to assume that no other backend is inserting items into the new toast table concurrently, but that's surely safe for CLUSTER and VACUUM FULL. Per bug #6393 from Maxim Boguk. Back-patch to 9.0, same as the previous patch.	2012-01-12 16:40:14 -05:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Robert Haas	cbe24a6dd8	Improve behavior of concurrent CLUSTER. In the previous coding, a user could queue up for an AccessExclusiveLock on a table they did not have permission to cluster, thus potentially interfering with access by authorized users who got stuck waiting behind the AccessExclusiveLock. This approach avoids that. cluster() has the same permissions-checking requirements as REINDEX TABLE, so this commit moves the now-shared callback to tablecmds.c and renames it, per discussion with Noah Misch.	2011-12-21 15:17:28 -05:00
Robert Haas	74a1d4fe7c	Improve behavior of concurrent rename statements. Previously, renaming a table, sequence, view, index, foreign table, column, or trigger checked permissions before locking the object, which meant that if permissions were revoked during the lock wait, we would still allow the operation. Similarly, if the original object is dropped and a new one with the same name is created, the operation will be allowed if we had permissions on the old object; the permissions on the new object don't matter. All this is now fixed. Along the way, attempting to rename a trigger on a foreign table now gives the same error message as trying to create one there in the first place (i.e. that it's not a table or view) rather than simply stating that no trigger by that name exists. Patch by me; review by Noah Misch.	2011-12-15 19:02:38 -05:00
Tom Lane	e6858e6657	Measure the number of all-visible pages for use in index-only scan costing. Add a column pg_class.relallvisible to remember the number of pages that were all-visible according to the visibility map as of the last VACUUM (or ANALYZE, or some other operations that update pg_class.relpages). Use relallvisible/relpages, instead of an arbitrary constant, to estimate how many heap page fetches can be avoided during an index-only scan. This is pretty primitive and will no doubt see refinements once we've acquired more field experience with the index-only scan mechanism, but it's way better than using a constant. Note: I had to adjust an underspecified query in the window.sql regression test, because it was changing answers when the plan changed to use an index-only scan. Some of the adjacent tests perhaps should be adjusted as well, but I didn't do that here.	2011-10-14 17:23:46 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00

1 2 3 4 5 ...

274 Commits