postgresql

Commit Graph

Author	SHA1	Message	Date
Andres Freund	3bdcf6a5a7	Don't allow disabling backend assertions via the debug_assertions GUC. The existence of the assert_enabled variable (backing the debug_assertions GUC) reduced the amount of knowledge some static code checkers (like Coverity and various compilers) could infer from the existence of the assertion. That could have been solved by optionally removing the assert_enabled variable from the Assert() et al macros at compile time when some special macro is defined, but the resulting complication doesn't seem to be worth the gain from having debug_assertions. Recompiling is fast enough. The debug_assertions GUC is still available, but readonly, as it's useful when diagnosing problems. The commandline/client startup option -A, which previously also allowed enabling/disabling assertions, has been removed as it doesn't serve a purpose anymore. While at it, reduce code duplication in bufmgr.c and localbuf.c assertions checking for spurious buffer pins. That code had to be reindented anyway to cope with the assert_enabled removal.	2014-06-20 11:09:17 +02:00
Alvaro Herrera	7937910781	Fix typos	2014-06-12 14:01:01 -04:00
Tom Lane	e416830a29	Prevent auto_explain from changing the output of a user's EXPLAIN. Commit `af7914c662`, which introduced the EXPLAIN (TIMING) option, for some reason coded explain.c to look at planstate->instrument->need_timer rather than es->timing to decide whether to print timing info. However, the former flag might get set as a result of contrib/auto_explain wanting timing information. We certainly don't want activation of auto_explain to change user-visible statement behavior, so fix that. Also fix an independent bug introduced in the same patch: in the code path for a never-executed node with a machine-friendly output format, if timing was selected, it would fail to print the Actual Rows and Actual Loops items. Per bug #10404 from Tomonari Katsumata. Back-patch to 9.2 where the faulty code was introduced.	2014-05-20 12:20:47 -04:00
Bruce Momjian	0a78320057	pgindent run for 9.4. This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not affect backpatching.	2014-05-06 12:12:18 -04:00
Tom Lane	2d00190495	Rationalize common/relpath.[hc]. Commit `a730183926` created rather a mess by putting dependencies on backend-only include files into include/common. We really shouldn't do that. To clean it up: * Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in catalog/catalog.h. We won't consider this symbol part of the FE/BE API. * Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider relpath.h as the source of truth for fork numbers, since relpath.c was already partially serving that function, and anyway relfilenode.h was kind of a random place for that enum. * So, relfilenode.h now includes relpath.h rather than vice-versa. This direction of dependency is fine. (That allows most, but not quite all, of the existing explicit #includes of relpath.h to go away again.) * Push forkname_to_number from catalog.c to relpath.c, just to centralize fork number stuff a bit better. * Push GetDatabasePath from catalog.c to relpath.c; it was rather odd that the previous commit didn't keep this together with relpath(). * To avoid needing relfilenode.h in common/, redefine the underlying function (now called GetRelationPath) as taking separate OID arguments, and make the APIs using RelFileNode or RelFileNodeBackend into macro wrappers. (The macros have a potential multiple-eval risk, but none of the existing call sites have an issue with that; one of them had such a risk already anyway.) * Fix failure to follow the directions when "init" fork type was added; specifically, the errhint in forkname_to_number wasn't updated, and neither was the SGML documentation for pg_relation_size(). * Fix tablespace-path-too-long check in CreateTableSpace() to account for fork-name component of maximum-length pathnames. This requires putting FORKNAMECHARS into a header file, but it was rather useless (and actually unreferenced) where it was. The last couple of items are potentially back-patchable bug fixes, if anyone is sufficiently excited about them; but personally I'm not. Per a gripe from Christoph Berg about how include/common wasn't self-contained.	2014-04-30 17:30:50 -04:00
Tom Lane	f0fedfe82c	Allow polymorphic aggregates to have non-polymorphic state data types. Before 9.4, such an aggregate couldn't be declared, because its final function would have to have polymorphic result type but no polymorphic argument, which CREATE FUNCTION would quite properly reject. The ordered-set-aggregate patch found a workaround: allow the final function to be declared as accepting additional dummy arguments that have types matching the aggregate's regular input arguments. However, we failed to notice that this problem applies just as much to regular aggregates, despite the fact that we had a built-in regular aggregate array_agg() that was known to be undeclarable in SQL because its final function had an illegal signature. So what we should have done, and what this patch does, is to decouple the extra-dummy-arguments behavior from ordered-set aggregates and make it generally available for all aggregate declarations. We have to put this into 9.4 rather than waiting till later because it slightly alters the rules for declaring ordered-set aggregates. The patch turned out a bit bigger than I'd hoped because it proved necessary to record the extra-arguments option in a new pg_aggregate column. I'd thought we could just look at the final function's pronargs at runtime, but that didn't work well for variadic final functions. It's probably just as well though, because it simplifies life for pg_dump to record the option explicitly. While at it, fix array_agg() to have a valid final-function signature, and add an opr_sanity test to notice future deviations from polymorphic consistency. I also marked the percentile_cont() aggregates as not needing extra arguments, since they don't.	2014-04-23 19:17:41 -04:00
Heikki Linnakangas	8d34f68628	Avoid transient bogus page contents when creating a sequence. Don't use simple_heap_insert to insert the tuple to a sequence relation. simple_heap_insert creates a heap insertion WAL record, and replaying that will create a regular heap page without the special area containing the sequence magic constant, which is wrong for a sequence. That was not a bug because we always created a sequence WAL record after that, and replaying that overwrote the bogus heap page, and the transient state could never be seen by another backend because it was only done when creating a new sequence relation. But it's simpler and cleaner to avoid that in the first place.	2014-04-22 10:40:23 +03:00
Heikki Linnakangas	2a8e1ac598	Set the all-visible flag on heap page before writing WAL record, not after. If we set the all-visible flag after writing WAL record, and XLogInsert takes a full-page image of the page, the image would not include the flag. We will then proceed to set the VM bit, which would then be set without the corresponding all-visible flag on the heap page. Found by comparing page images on master and standby, after writing/replaying each WAL record. (There is still a discrepancy: the all-visible flag won't be set after replaying the HEAP_CLEAN record, even though it is set in the master. However, it will be set when replaying the HEAP2_VISIBLE record and setting the VM bit, so the all-visible flag and VM bit are always consistent on the standby, even though they are momentarily out-of-sync with master.) Backpatch to 9.3 where this code was introduced.	2014-04-17 17:47:50 +03:00
Tom Lane	5f86cbd714	Rename EXPLAIN ANALYZE's "total runtime" output to "execution time". Now that EXPLAIN also outputs a "planning time" measurement, the use of "total" here seems rather confusing: it sounds like it might include the planning time which of course it doesn't. Majority opinion was that "execution time" is a better label, so we'll call it that. This should be noted as a backwards incompatibility for tools that examine EXPLAIN ANALYZE output. In passing, I failed to resist the temptation to do a little editing on the materialized-view example affected by this change.	2014-04-16 20:48:59 -04:00
Tom Lane	e0c91a7ff0	Improve some O(N^2) behavior in window function evaluation. Repositioning the tuplestore seek pointer in window_gettupleslot() turns out to be a very significant expense when the window frame is sizable and the frame end can move. To fix, introduce a tuplestore function for skipping an arbitrary number of tuples in one call, parallel to the one we introduced for tuplesort objects in commit `8d65da1f`. This reduces the cost of window_gettupleslot() to O(1) if the tuplestore has not spilled to disk. As in the previous commit, I didn't try to do any real optimization of tuplestore_skiptuples for the case where the tuplestore has spilled to disk. There is probably no practical way to get the cost to less than O(N) anyway, but perhaps someone can think of something later. Also fix PersistHoldablePortal() to make use of this API now that we have it. Based on a suggestion by Dean Rasheed, though this turns out not to look much like his patch.	2014-04-13 13:59:17 -04:00
Stephen Frost	842faa714c	Make security barrier views automatically updatable. Views which are marked as security_barrier must have their quals applied before any user-defined quals are called, to prevent user-defined functions from being able to see rows which the security barrier view is intended to prevent them from seeing. Remove the restriction on security barrier views being automatically updatable by adding a new securityQuals list to the RTE structure which keeps track of the quals from security barrier views at each level, independently of the user-supplied quals. When RTEs are later discovered which have securityQuals populated, they are turned into subquery RTEs which are marked as security_barrier to prevent any user-supplied quals being pushed down (modulo LEAKPROOF quals). Dean Rasheed, reviewed by Craig Ringer, Simon Riggs, KaiGai Kohei	2014-04-12 21:04:58 -04:00
Tom Lane	a9d9acbf21	Create infrastructure for moving-aggregate optimization. Until now, when executing an aggregate function as a window function within a window with moving frame start (that is, any frame start mode except UNBOUNDED PRECEDING), we had to recalculate the aggregate from scratch each time the frame head moved. This patch allows an aggregate definition to include an alternate "moving aggregate" implementation that includes an inverse transition function for removing rows from the aggregate's running state. As long as this can be done successfully, runtime is proportional to the total number of input rows, rather than to the number of input rows times the average frame length. This commit includes the core infrastructure, documentation, and regression tests using user-defined aggregates. Follow-on commits will update some of the built-in aggregates to use this feature. David Rowley and Florian Pflug, reviewed by Dean Rasheed; additional hacking by me	2014-04-12 12:03:30 -04:00
Simon Riggs	e5550d5fec	Reduce lock levels of some ALTER TABLE cmds: VALIDATE CONSTRAINT; CLUSTER ON; SET WITHOUT CLUSTER; ALTER COLUMN SET STATISTICS; ALTER COLUMN SET (); ALTER COLUMN RESET (). All other sub-commands use AccessExclusiveLock. Simon Riggs and Noah Misch. Reviews by Robert Haas and Andres Freund.	2014-04-06 11:13:43 -04:00
Tom Lane	abe075dfff	Fix tablespace creation WAL replay to work on Windows. The code segment that removes the old symlink (if present) wasn't clued into the fact that on Windows, symlinks are junction points which have to be removed with rmdir(). Backpatch to 9.0, where the failing code was introduced. MauMau, reviewed by Muhammad Asif Naeem and Amit Kapila	2014-04-04 23:09:35 -04:00
Noah Misch	7cbe57c34d	Offer triggers on foreign tables. This covers all the SQL-standard trigger types supported for regular tables; it does not cover constraint triggers. The approach for acquiring the old row mirrors that for view INSTEAD OF triggers. For AFTER ROW triggers, we spool the foreign tuples to a tuplestore. This changes the FDW API contract; when deciding which columns to populate in the slot returned from data modification callbacks, writable FDWs will need to check for AFTER ROW triggers in addition to checking for a RETURNING clause. In support of the feature addition, refactor the TriggerFlags bits and the assembly of old tuples in ModifyTable. Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.	2014-03-23 02:16:34 -04:00
Noah Misch	6115480c54	Improve comments about AfterTriggerBeginQuery() query level usage.	2014-03-23 02:15:52 -04:00
Tom Lane	f7271c4427	Fix relcache reference leak in refresh_by_match_merge(). One path through the loop over indexes forgot to do index_close(). Rather than adding a fourth call, restructure slightly so that there's only one. In passing, get rid of an unnecessary syscache lookup: the pg_index struct for the index is already available from its relcache entry. Per report from YAMAMOTO Takashi, though this is a bit different from his suggested patch. This is new code in HEAD, so no need for back-patch.	2014-03-18 11:36:53 -04:00
Tom Lane	7bae0284ee	Avoid transaction-commit race condition while receiving a NOTIFY message. Use TransactionIdIsInProgress, then TransactionIdDidCommit, to distinguish whether a NOTIFY message's originating transaction is in progress, committed, or aborted. The previous coding could accept a message from a transaction that was still in-progress according to the PGPROC array; if the client were fast enough at starting a new transaction, it might fail to see table rows added/updated by the message-sending transaction. Which of course would usually be the point of receiving the message. We noted this type of race condition long ago in tqual.c, but async.c overlooked it. The race condition probably cannot occur unless there are multiple NOTIFY senders in action, since an individual backend doesn't send NOTIFY signals until well after it's done committing. But if two senders commit in close succession, it's certainly possible that we could see the second sender's message within the race condition window while responding to the signal from the first one. Per bug #9557 from Marko Tiikkaja. This patch is slightly more invasive than what he proposed, since it removes the now-redundant TransactionIdDidAbort call. Back-patch to 9.0, where the current NOTIFY implementation was introduced.	2014-03-13 12:02:54 -04:00
Bruce Momjian	5024044a20	C comments: improve description of relfilenode uniqueness. Report by Antonin Houska	2014-03-08 12:20:30 -05:00
Tom Lane	7c31874945	Avoid getting more than AccessShareLock when deparsing a query. In make_ruledef and get_query_def, we have long used AcquireRewriteLocks to ensure that the querytree we are about to deparse is up-to-date and the schemas of the underlying relations aren't changing. However, that function thinks the query is about to be executed, so it acquires locks that are stronger than necessary for the purpose of deparsing. Thus for example, if pg_dump asks to deparse a rule that includes "INSERT INTO t", we'd acquire RowExclusiveLock on t. That results in interference with concurrent transactions that might for example ask for ShareLock on t. Since pg_dump is documented as being purely read-only, this is unexpected. (Worse, it used to actually be read-only; this behavior dates back only to 8.1, cf commit ba4200246.) Fix this by adding a parameter to AcquireRewriteLocks to tell it whether we want the "real" execution locks or only AccessShareLock. Report, diagnosis, and patch by Dean Rasheed. Back-patch to all supported branches.	2014-03-06 19:31:05 -05:00
Andrew Dunstan	3b5e03dca2	Provide a FORCE NULL option to COPY in CSV mode. This forces an input field containing the quoted null string to be returned as a NULL. Without this option, only unquoted null strings behave this way. This helps where some CSV producers insist on quoting every field, whether or not it is needed. The option takes a list of fields, and only applies to those columns. There is an equivalent column-level option added to file_fdw. Ian Barwick, with some tweaking by Andrew Dunstan, reviewed by Payal Singh.	2014-03-04 17:31:59 -05:00
Robert Haas	af2543e884	Allow VACUUM FULL/CLUSTER to bump freeze horizons even for pg_class. pg_class is a special case for CLUSTER and VACUUM FULL, so although commit `3cff1879f8` caused these operations to advance relfrozenxid and relminmxid for all other tables, it did not provide the same benefit for pg_class. This plugs that gap. Andres Freund	2014-03-04 11:08:18 -05:00
Robert Haas	b89e151054	Introduce logical decoding. This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the affected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.	2014-03-03 16:32:18 -05:00
Robert Haas	cf6aa68bbd	Update a few comments to mention materialized views. Etsuro Fujita	2014-02-25 13:40:12 -05:00
Tom Lane	769065c1b2	Prefer pg_any_to_server/pg_server_to_any over pg_do_encoding_conversion. A large majority of the callers of pg_do_encoding_conversion were specifying the database encoding as either source or target of the conversion, meaning that we can use the less general functions pg_any_to_server/pg_server_to_any instead. The main advantage of using the latter functions is that they can make use of a cached conversion-function lookup in the common case that the other encoding is the current client_encoding. It's notationally cleaner too in most cases, not least because of the historical artifact that the latter functions use "char *" rather than "unsigned char *" in their APIs. Note that pg_any_to_server will apply an encoding verification step in some cases where pg_do_encoding_conversion would have just done nothing. This seems to me to be a good idea at most of these call sites, though it partially negates the performance benefit. Per discussion of bug #9210.	2014-02-23 16:59:05 -05:00
Robert Haas	5f173040e3	Avoid repeated name lookups during table and index DDL. If the name lookups come to different conclusions due to concurrent activity, we might perform some parts of the DDL on a different table than other parts. At least in the case of CREATE INDEX, this can be used to cause the permissions checks to be performed against a different table than the index creation, allowing for a privilege escalation attack. This changes the calling convention for DefineIndex, CreateTrigger, transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible (in 9.2 and newer), and AlterTable (in 9.1 and older). In addition, CheckRelationOwnership is removed in 9.2 and newer and the calling convention is changed in older branches. A field has also been added to the Constraint node (FkConstraint in 8.4). Third-party code calling these functions or using the Constraint node will require updating. Report by Andres Freund. Patch by Robert Haas and Andres Freund, reviewed by Tom Lane. Security: CVE-2014-0062	2014-02-17 09:33:31 -05:00
Noah Misch	537cbd35c8	Prevent privilege escalation in explicit calls to PL validators. The primary role of PL validators is to be called implicitly during CREATE FUNCTION, but they are also normal functions that a user can call explicitly. Add a permissions check to each validator to ensure that a user cannot use explicit validator calls to achieve things he could not otherwise achieve. Back-patch to 8.4 (all supported versions). Non-core procedural language extensions ought to make the same two-line change to their own validators. Andres Freund, reviewed by Tom Lane and Noah Misch. Security: CVE-2014-0061	2014-02-17 09:33:31 -05:00
Noah Misch	fea164a72a	Shore up ADMIN OPTION restrictions. Granting a role without ADMIN OPTION is supposed to prevent the grantee from adding or removing members from the granted role. Issuing SET ROLE before the GRANT bypassed that, because the role itself had an implicit right to add or remove members. Plug that hole by recognizing that implicit right only when the session user matches the current role. Additionally, do not recognize it during a security-restricted operation or during execution of a SECURITY DEFINER function. The restriction on SECURITY DEFINER is not security-critical. However, it seems best for a user testing his own SECURITY DEFINER function to see the same behavior others will see. Back-patch to 8.4 (all supported versions). The SQL standards do not conflate roles and users as PostgreSQL does; only SQL roles have members, and only SQL users initiate sessions. An application using PostgreSQL users and roles as SQL users and roles will never attempt to grant membership in the role that is the session user, so the implicit right to add or remove members will never arise. The security impact was mostly that a role member could revoke access from others, contrary to the wishes of his own grantor. Unapproved role member additions are less notable, because the member can still largely achieve that by creating a view or a SECURITY DEFINER function. Reviewed by Andres Freund and Tom Lane. Reported, independently, by Jonas Sundman and Noah Misch. Security: CVE-2014-0060	2014-02-17 09:33:31 -05:00
Alvaro Herrera	801c2dc72c	Separate multixact freezing parameters from xid's. Previously we were piggybacking on transaction ID parameters to freeze multixacts; but since there isn't necessarily any relationship between rates of Xid and multixact consumption, this turns out not to be a good idea. Therefore, we now have multixact-specific freezing parameters: vacuum_multixact_freeze_min_age: when to remove multis as we come across them in vacuum (default to 5 million, i.e. early in comparison to Xid's default of 50 million). vacuum_multixact_freeze_table_age: when to force whole-table scans instead of scanning only the pages marked as not all visible in visibility map (default to 150 million, same as for Xids). Whichever of the two reaches the 150 million mark earlier will cause a whole-table scan. autovacuum_multixact_freeze_max_age: when to cause emergency, uninterruptible whole-table scans (default to 400 million, double that for Xids). This means there shouldn't be more frequent emergency vacuuming than previously, unless multixacts are being used very rapidly. Backpatch to 9.3 where multixacts were made to persist enough to require freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of fields in an unnatural place, and StdRdOptions is split in two so that the newly added fields can go at the end. Patch by me, reviewed by Robert Haas, with additional input from Andres Freund and Tom Lane.	2014-02-13 19:36:31 -03:00
Peter Eisentraut	66c04c981d	Mark some more variables as static or include the appropriate header. Detected by clang's -Wmissing-variable-declarations. From: Andres Freund <andres@anarazel.de>	2014-02-08 21:21:46 -05:00
Tom Lane	571addd729	Fix unsafe references to errno within error messaging logic. Various places were supposing that errno could be expected to hold still within an ereport() nest or similar contexts. This isn't true necessarily, though in some cases it accidentally failed to fail depending on how the compiler chanced to order the subexpressions. This class of thinko explains recent reports of odd failures on clang-built versions, typically missing or inappropriate HINT fields in messages. Problem identified by Christian Kruse, who also submitted the patch this commit is based on. (I fixed a few issues in his patch and found a couple of additional places with the same disease.) Back-patch as appropriate to all supported branches.	2014-01-29 20:04:43 -05:00
Robert Haas	9347baa5bb	Include planning time in EXPLAIN ANALYZE output. This doesn't work for prepared queries, but it's not too easy to get the information in that case and there's some debate as to exactly what the right thing to measure is, so just do this for now. Andreas Karlsson, with slight doc changes by me.	2014-01-29 16:09:15 -05:00
Stephen Frost	fbe19ee3b8	ALTER TABLESPACE ... MOVE ... OWNED BY. Add the ability to specify the objects to move by their owner (as relowner) and change ALL to mean ALL objects. This makes the command always operate against a well-defined set of objects and not have the objects-to-be-moved based on the role of the user running the command. Per discussion with Simon and Tom.	2014-01-23 23:52:40 -05:00
Alvaro Herrera	b152c6cd0d	Make DROP IF EXISTS more consistently not fail. Some cases were still reporting errors and aborting, instead of a NOTICE that the object was being skipped. This makes it more difficult to cleanly handle pg_dump --clean, so change that to instead skip missing objects properly. Per bug #7873 reported by Dave Rolsky; apparently this affects a large number of users. Authors: Pavel Stehule and Dean Rasheed. Some tweaks by Álvaro Herrera	2014-01-23 14:40:29 -03:00
Alvaro Herrera	d2458e3b20	Expose a routine to print triggers during EXPLAIN ANALYZE. This is so that auto_explain can use it. Kyotaro HORIGUCHI	2014-01-20 17:13:47 -03:00
Fujii Masao	5363c7f2bc	Fix typo in comment. Sawada Masahiko	2014-01-21 02:24:17 +09:00
Simon Riggs	4d1e2aeb1a	Speed up COPY into tables with DEFAULT nextval(). Previously the presence of a nextval() prevented the use of batch-mode COPY. This patch introduces a special case just for nextval() functions. In future we will introduce a general case solution for labelling volatile functions as safe for use.	2014-01-20 17:22:38 +00:00
Stephen Frost	5254958e92	Add CREATE TABLESPACE ... WITH ... Options. Tablespaces have a few options which can be set on them to give PG hints as to how the tablespace behaves (perhaps it's faster for sequential scans, or better able to handle random access, etc). These options were only available through the ALTER TABLESPACE command. This adds the ability to set these options at CREATE TABLESPACE time, removing the need to do both a CREATE TABLESPACE and ALTER TABLESPACE to get the correct options set on the tablespace. Vik Fearing, reviewed by Michael Paquier.	2014-01-18 20:59:31 -05:00
Tom Lane	115f414124	Fix VACUUM's reporting of dead-tuple counts to the stats collector. Historically, VACUUM has just reported its new_rel_tuples estimate (the same thing it puts into pg_class.reltuples) to the stats collector. That number counts both live and dead-but-not-yet-reclaimable tuples. This behavior may once have been right, but modern versions of the pgstats code track live and dead tuple counts separately, so putting the total into n_live_tuples and zero into n_dead_tuples is surely pretty bogus. Fix it to report live and dead tuple counts separately. This doesn't really do much for situations where updating transactions commit concurrently with a VACUUM scan (possibly causing double-counting or omission of the tuples they add or delete); but it's clearly an improvement over what we were doing before. Hari Babu, reviewed by Amit Kapila	2014-01-18 19:24:33 -05:00
Stephen Frost	76e91b38ba	Add ALTER TABLESPACE ... MOVE command. This adds a 'MOVE' sub-command to ALTER TABLESPACE which allows moving sets of objects from one tablespace to another. This can be extremely handy and avoids a lot of error-prone scripting. ALTER TABLESPACE ... MOVE will only move objects the user owns, will notify the user if no objects were found, and can be used to move ALL objects or specific types of objects (TABLES, INDEXES, or MATERIALIZED VIEWS).	2014-01-18 18:56:40 -05:00
Stephen Frost	6f25c62d78	Allow SET TABLESPACE to database default. We've always allowed CREATE TABLE to create tables in the database's default tablespace without checking for CREATE permissions on that tablespace. Unfortunately, the original implementation of ALTER TABLE ... SET TABLESPACE didn't pick up on that exception. This changes ALTER TABLE ... SET TABLESPACE to allow the database's default tablespace without checking for CREATE rights on that tablespace, just as CREATE TABLE works today. Users could always do this through a series of commands (CREATE TABLE ... AS SELECT * FROM ...; DROP TABLE ...; etc), so let's fix the oversight in SET TABLESPACE's original implementation.	2014-01-18 18:41:52 -05:00
Tom Lane	0d79c0a8cc	Make various variables const (read-only). These changes should generally improve correctness/maintainability. A nice side benefit is that several kilobytes move from initialized data to text segment, allowing them to be shared across processes and probably reducing copy-on-write overhead while forking a new backend. Unfortunately this doesn't seem to help libpq in the same way (at least not when it's compiled with -fpic on x86_64), but we can hope the linker at least collects all nominally-const data together even if it's not actually part of the text segment. Also, make pg_encname_tbl[] static in encnames.c, since there seems no very good reason for any other code to use it; per a suggestion from Wim Lewis, who independently submitted a patch that was mostly a subset of this one. Oskari Saarenmaa, with some editorialization by me	2014-01-18 16:04:32 -05:00
Robert Haas	2bb1f14b89	Make bitmap heap scans show exact/lossy block info in EXPLAIN ANALYZE. Etsuro Fujita	2014-01-13 14:42:16 -05:00
Tom Lane	6286526207	Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD. The standard typanalyze functions skip over values whose detoasted size exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during ANALYZE. However, we (I think I, actually :-() failed to consider the possibility that every non-null value in a column is too wide. While compute_minimal_stats() seems to behave reasonably anyway in such a case, compute_scalar_stats() just fell through and generated no pg_statistic entry at all. That's unnecessarily pessimistic: we can still produce valid stanullfrac and stawidth values in such cases, since we do include too-wide values in the average-width calculation. Furthermore, since the general assumption in this code is that too-wide values are probably all distinct from each other, it seems reasonable to set stadistinct to -1 ("all distinct"). Per complaint from Kadri Raudsepp. This has been like this since roughly neolithic times, so back-patch to all supported branches.	2014-01-11 13:42:42 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014. Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Peter Eisentraut	edc43458d7	Add more use of psprintf()	2014-01-06 21:30:26 -05:00
Alvaro Herrera	1a3e82a7f9	Restore some comments lost during `15732b34e8`. Michael Paquier	2014-01-03 13:22:03 -03:00
Robert Haas	3cff1879f8	Aggressively freeze tables when CLUSTER or VACUUM FULL rewrites them. We haven't wanted to do this in the past on the grounds that in rare cases the original xmin value will be needed for forensic purposes, but commit `37484ad2aa` removes that objection, so now we can. Per extensive discussion, among many people, on pgsql-hackers.	2014-01-02 15:15:51 -05:00
Tom Lane	c01bc51f8d	Fix broken support for event triggers as extension members. CREATE EVENT TRIGGER forgot to mark the event trigger as a member of its extension, and pg_dump didn't pay any attention anyway when deciding whether to dump the event trigger. Per report from Moshe Jacobson. Given the obvious lack of testing here, it's rather astonishing that ALTER EXTENSION ADD/DROP EVENT TRIGGER work, but they seem to.	2013-12-30 14:00:02 -05:00
Tom Lane	8d65da1f01	Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane	2013-12-23 16:11:35 -05:00
Robert Haas	37484ad2aa	Change the way we mark tuples as frozen. Instead of changing the tuple xmin to FrozenTransactionId, the combination of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never set together, is now defined as HEAP_XMIN_FROZEN. A variety of previous proposals to freeze tuples opportunistically before vacuum_freeze_min_age is reached have foundered on the objection that replacing xmin by FrozenTransactionId might hinder debugging efforts when things in this area go awry; this patch is intended to solve that problem by keeping the XID around (but largely ignoring the value to which it is set). Third-party code that checks for HEAP_XMIN_INVALID on tuples where HEAP_XMIN_COMMITTED might be set will be broken by this change. To fix, use the new accessor macros in htup_details.h rather than consulting the bits directly. HeapTupleHeaderGetXmin has been modified to return FrozenTransactionId when the infomask bits indicate that the tuple is frozen; use HeapTupleHeaderGetRawXmin when you already know that the tuple isn't marked committed or frozen, or want the raw value anyway. We currently do this in routines that display the xmin for user consumption, in tqual.c where it's known to be safe and important for the avoidance of extra cycles, and in the function-caching code for various procedural languages, which shouldn't invalidate the cache just because the tuple gets frozen. Robert Haas and Andres Freund	2013-12-22 15:49:09 -05:00
Bruce Momjian	527fdd9df1	Move pg_upgrade_support global variables to their own include file Previously their declarations were spread around to avoid accidental access.	2013-12-19 16:10:07 -05:00
Robert Haas	001a573a20	Allow on-detach callbacks for dynamic shared memory segments. Just as backends must clean up their shared memory state (releasing lwlocks, buffer pins, etc.) before exiting, they must also perform any similar cleanups related to dynamic shared memory segments they have mapped before unmapping those segments. So add a mechanism to ensure that. Existing on_shmem_exit hooks include both "user level" cleanup such as transaction abort and removal of leftover temporary relations and also "low level" cleanup that forcibly released leftover shared memory resources. On-detach callbacks should run after the first group but before the second group, so create a new before_shmem_exit function for registering the early callbacks and keep on_shmem_exit for the regular callbacks. (An earlier draft of this patch added an additional argument to on_shmem_exit, but that had a much larger footprint and probably a substantially higher risk of breaking third party code for no real gain.) Patch by me, reviewed by KaiGai Kohei and Andres Freund.	2013-12-18 13:09:09 -05:00
Bruce Momjian	dba5a9dda9	Comment: COPY comment improvement Etsuro Fujita	2013-12-17 12:51:16 -05:00
Alvaro Herrera	3b97e6823b	Rework tuple freezing protocol Tuple freezing was broken in connection to MultiXactIds; commit `8e53ae025d` tried to fix it, but didn't go far enough. As noted by Noah Misch, freezing a tuple whose Xmax is a multi containing an aborted update might cause locks in the multi to go ignored by later transactions. This is because the code depended on a multixact above their cutoff point not having any lock-only member older than the cutoff point for Xids, which is easily defeated in READ COMMITTED transactions. The fix for this involves creating a new MultiXactId when necessary. But this cannot be done during WAL replay, and moreover multixact examination requires using CLOG access routines which are not supposed to be used during WAL replay either; so tuple freezing cannot be done with the old freeze WAL record. Therefore, separate the freezing computation from its execution, and change the WAL record to carry all necessary information. At WAL replay time, it's easy to re-execute freezing because we don't need to re-compute the new infomask/Xmax values but just take them from the WAL record. While at it, restructure the coding to ensure all page changes occur in a single critical section without much room for failures. The previous coding wasn't using a critical section, without any explanation as to why this was acceptable. In replication scenarios using the 9.3 branch, standby servers must be upgraded before their master, so that they are prepared to deal with the new WAL record once the master is upgraded; failure to do so will cause WAL replay to die with a PANIC message. Later upgrade of the standby will allow the process to continue where it left off, so there's no disruption of the data in the standby in any case. Standbys know how to deal with the old WAL record, so it's okay to keep the master running the old code for a while. In master, the old freeze WAL record is gone, for cleanliness' sake; there's no compatibility concern there. Backpatch to 9.3, where the original bug was introduced and where the previous fix was backpatched. Álvaro Herrera and Andres Freund	2013-12-16 11:29:50 -03:00
Tom Lane	2efc6dc256	Add HOLD/RESUME_INTERRUPTS in HandleCatchupInterrupt/HandleNotifyInterrupt. This prevents a possible longjmp out of the signal handler if a timeout or SIGINT occurs while something within the handler has transiently set ImmediateInterruptOK. For safety we must hold off the timeout or cancel error until we're back in mainline, or at least till we reach the end of the signal handler when ImmediateInterruptOK was true at entry. This syncs these functions with the logic now present in handle_sig_alarm. AFAICT there is no live bug here in 9.0 and up, because I don't think we currently can wait for any heavyweight lock inside these functions, and there is no other code (except read-from-client) that will turn on ImmediateInterruptOK. However, that was not true pre-9.0: in older branches ProcessIncomingNotify might block trying to lock pg_listener, and then a SIGINT could lead to undesirable control flow. It might be all right anyway given the relatively narrow code ranges in which NOTIFY interrupts are enabled, but for safety's sake I'm back-patching this.	2013-12-13 14:05:51 -05:00
Heikki Linnakangas	dde6282500	Fix more instances of "the the" in comments. Plus one instance of "to to" in the docs.	2013-12-13 20:02:01 +02:00
Heikki Linnakangas	a49633d8dc	Fix WAL-logging of setting the visibility map bit. The operation that removes the remaining dead tuples from the page must be WAL-logged before the setting of the VM bit. Otherwise, if you replay the WAL to between those two records, you end up with the VM bit set, but the dead tuples are still there. Backpatch to 9.3, where this bug was introduced.	2013-12-13 14:15:04 +02:00
Tom Lane	f26099057a	Improve EXPLAIN to print the grouping columns in Agg and Group nodes. Per request from Kevin Grittner.	2013-12-12 11:24:38 -05:00
Simon Riggs	8693559cac	New autovacuum_work_mem parameter If autovacuum_work_mem is set, autovacuum workers now use this parameter in preference to maintenance_work_mem. Peter Geoghegan	2013-12-12 11:42:39 +00:00
Robert Haas	66abc2608c	Add a new reloption, user_catalog_table. When this reloption is set and wal_level=logical is configured, we'll record the CIDs stamped by inserts, updates, and deletes to the table just as we would for an actual catalog table. This will allow logical decoding to use historical MVCC snapshots to access such tables just as they access ordinary catalog tables. Replication solutions built around the logical decoding machinery will likely need to set this operation for their configuration tables; it might also be needed by extensions which perform table access in their output functions. Andres Freund, reviewed by myself and others.	2013-12-10 19:17:34 -05:00
Robert Haas	e55704d8b2	Add new wal_level, logical, sufficient for logical decoding. When wal_level=logical, we'll log columns from the old tuple as configured by the REPLICA IDENTITY facility added in commit `07cacba983`. This makes it possible for a properly-configured logical replication solution to correctly follow table updates even if they change the chosen key columns, or, with REPLICA IDENTITY FULL, even if the table has no key at all. Note that updates which do not modify the replica identity column won't log anything extra, making the choice of a good key (i.e. one that will rarely be changed) important to performance when wal_level=logical is configured. Each insert, update, or delete to a catalog table will also log the CMIN and/or CMAX values stamped by the current transaction. This is necessary because logical decoding will require access to historical snapshots of the catalog in order to decode some data types, and the CMIN/CMAX values that we may need in order to judge row visibility may have been overwritten by the time we need them. Andres Freund, reviewed in various versions by myself, Heikki Linnakangas, KONDO Mitsumasa, and many others.	2013-12-10 19:01:40 -05:00
Heikki Linnakangas	9e857436ef	Don't include unused space in LOG_NEWPAGE records. This is the same trick we use when taking a full page image of a buffer passed to XLogInsert.	2013-12-04 00:10:47 +02:00
Alvaro Herrera	f54106f77e	Fix full-table-vacuum request mechanism for MultiXactIds While autovacuum dutifully launched anti-multixact-wraparound vacuums when the multixact "age" was reached, the vacuum code was not aware that it needed to make them be full table vacuums. As the resulting partial-table vacuums aren't capable of actually increasing relminmxid, autovacuum continued to launch anti-wraparound vacuums that didn't have the intended effect, until age of relfrozenxid caused the vacuum to finally be a full table one via vacuum_freeze_table_age. To fix, introduce logic for multixacts similar to that for plain TransactionIds, using the same GUCs. Backpatch to 9.3, where permanent MultiXactIds were introduced. Andres Freund, some cleanup by Álvaro	2013-11-29 21:47:13 -03:00
Robert Haas	8e18d04d4d	Refine our definition of what constitutes a system relation. Although user-defined relations can't be directly created in pg_catalog, it's possible for them to end up there, because you can create them in some other schema and then use ALTER TABLE .. SET SCHEMA to move them there. Previously, such relations couldn't afterwards be manipulated, because IsSystemRelation()/IsSystemClass() rejected all attempts to modify objects in the pg_catalog schema, regardless of their origin. With this patch, they now reject only those objects in pg_catalog which were created at initdb-time, allowing most operations on user-created tables in pg_catalog to proceed normally. This patch also adds new functions IsCatalogRelation() and IsCatalogClass(), which is similar to IsSystemRelation() and IsSystemClass() but with a slightly narrower definition: only TOAST tables of system catalogs are included, rather than all TOAST tables. This is currently used only for making decisions about when invalidation messages need to be sent, but upcoming logical decoding patches will find other uses for this information. Andres Freund, with some modifications by me.	2013-11-28 20:57:20 -05:00
Heikki Linnakangas	82b43f7df2	Don't update relfrozenxid if any pages were skipped. Vacuum recognizes that it can update relfrozenxid by checking whether it has processed all pages of a relation. Unfortunately it performed that check after truncating the dead pages at the end of the relation, and used the new number of pages to decide whether all pages have been scanned. If the new number of pages happened to be smaller or equal to the number of pages scanned, it incorrectly decided that all pages were scanned. This can lead to relfrozenxid being updated, even though some pages were skipped that still contain old XIDs. That can lead to data loss due to xid wraparounds with some rows suddenly missing. This likely has escaped notice so far because it takes a large number (~2^31) of xids being used to see the effect, while a full-table vacuum before that would fix the issue. The incorrect logic was introduced by commit `b4b6923e03`. Backpatch this fix down to 8.4, like that commit. Andres Freund, with some modifications by me.	2013-11-27 13:43:27 +02:00
Tom Lane	784e762e88	Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. This patch adds the ability to write TABLE( function1(), function2(), ...) as a single FROM-clause entry. The result is the concatenation of the first row from each function, followed by the second row from each function, etc; with NULLs inserted if any function produces fewer rows than others. This is believed to be a much more useful behavior than what Postgres currently does with multiple SRFs in a SELECT list. This syntax also provides a reasonable way to combine use of column definition lists with WITH ORDINALITY: put the column definition list inside TABLE(), where it's clear that it doesn't control the ordinality column as well. Also implement SQL-compliant multiple-argument UNNEST(), by turning UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)). The SQL standard specifies TABLE() with only a single function, not multiple functions, and it seems to require an implicit UNNEST() which is not what this patch does. There may be something wrong with that reading of the spec, though, because if it's right then the spec's TABLE() is just a pointless alternative spelling of UNNEST(). After further review of that, we might choose to adopt a different syntax for what this patch does, but in any case this functionality seems clearly worthwhile. Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and significantly revised by me	2013-11-21 19:37:20 -05:00
Heikki Linnakangas	4c697d8f48	Count locked pages that don't need vacuuming as scanned. Previously, if VACUUM skipped vacuuming a page because it's pinned, it didn't count that page as scanned. However, that meant that relfrozenxid was not bumped up either, which prevented anti-wraparound vacuum from doing its job. Report by Миша Тюрин, analysis and patch by Sergey Burladyn and Jeff Janes. Backpatch to 9.2, where the skip-locked-pages behavior was introduced.	2013-11-18 09:51:09 +02:00
Tom Lane	6cb86143e8	Allow aggregates to provide estimates of their transition state data size. Formerly the planner had a hard-wired rule of thumb for guessing the amount of space consumed by an aggregate function's transition state data. This estimate is critical to deciding whether it's OK to use hash aggregation, and in many situations the built-in estimate isn't very good. This patch adds a column to pg_aggregate wherein a per-aggregate estimate can be provided, overriding the planner's default, and infrastructure for setting the column via CREATE AGGREGATE. It may be that additional smarts will be required in future, perhaps even a per-aggregate estimation function. But this is already a step forward. This is extracted from a larger patch to improve the performance of numeric and int8 aggregates. I (tgl) thought it was worth reviewing and committing this infrastructure separately. In this commit, all built-in aggregates are given aggtransspace = 0, so no behavior should change. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra	2013-11-16 16:03:40 -05:00
Tom Lane	80e3a470ba	Minor comment corrections for sequence hashtable patch. There were enough typos in the comments to annoy me ...	2013-11-15 12:17:12 -05:00
Heikki Linnakangas	5cb719beee	Fix bogus hash table creation. Andres Freund	2013-11-15 14:23:40 +02:00
Heikki Linnakangas	21025d4a53	Use a hash table to store current sequence values. This speeds up nextval() and currval(), when you touch a lot of different sequences in the same backend. David Rowley	2013-11-15 12:29:38 +02:00
Peter Eisentraut	001e114b8d	Fix whitespace issues found by git diff --check, add gitattributes Set per file type attributes in .gitattributes to fine-tune whitespace checks. With the associated cleanups, the tree is now clean for git	2013-11-10 14:48:29 -05:00
Robert Haas	07cacba983	Add the notion of REPLICA IDENTITY for a table. Pending patches for logical replication will use this to determine which columns of a tuple ought to be considered as its candidate key. Andres Freund, with minor, mostly cosmetic adjustments by me	2013-11-08 12:30:43 -05:00
Tom Lane	060b22a99a	Fix subtly-wrong volatility checking in BeginCopyFrom(). contain_volatile_functions() is best applied to the output of expression_planner(), not its input, so that insertion of function default arguments and constant-folding have been done. (See comments at CheckMutability, for instance.) It's perhaps unlikely that anyone will notice a difference in practice, but still we should do it properly. In passing, change variable type from Node* to Expr* to reduce the net number of casts needed. Noted while perusing uses of contain_volatile_functions().	2013-11-08 08:59:39 -05:00
Kevin Grittner	5829082a57	Keep heap open until new heap generated in RMV. Early close became apparent when invalidation messages were processed in a new location under CLOBBER_CACHE_ALWAYS builds, due to additional locking. Back-patch to 9.3	2013-11-06 12:27:52 -06:00
Kevin Grittner	2636ecf78b	Lock relation used to generate fresh data for RMV. The relation should not be accessible to any other process, but it should be locked for consistency. Since this is not known to cause any bug, it will not be back-patched, at least for now. Per report from Andres Freund	2013-11-05 15:36:33 -06:00
Kevin Grittner	2a781d57dc	Acquire appropriate locks when rewriting during RMV. Since the query has not been freshly parsed when executing REFRESH MATERIALIZED VIEW, locks must be explicitly taken before rewrite. Backpatch to 9.3. Andres Freund	2013-11-02 19:18:08 -05:00
Tom Lane	45f64f1bbf	Remove CTimeZone/HasCTZSet, root and branch. These variables no longer have any useful purpose, since there's no reason to special-case brute force timezones now that we have a valid session_timezone setting for them. Remove the variables, and remove the SET/SHOW TIME ZONE code that deals with them. The user-visible impact of this is that SHOW TIME ZONE will now show a POSIX-style zone specification, in the form "<+-offset>-+offset", rather than an interval value when a brute-force zone has been set. While perhaps less intuitive, this is a better definition than before because it's actually possible to give that string back to SET TIME ZONE and get the same behavior, unlike what used to happen. We did not previously mention the angle-bracket syntax when describing POSIX timezone specifications; add some documentation so that people can figure out what these strings do. (There's still quite a lot of undocumented functionality there, but anybody who really cares can go read the POSIX spec to find out about it. In practice most people seem to prefer Olsen-style city names anyway.)	2013-11-01 13:57:31 -04:00
Tom Lane	631dc390f4	Fix some odd behaviors when using a SQL-style simple GMT offset timezone. Formerly, when using a SQL-spec timezone setting with a fixed GMT offset (called a "brute force" timezone in the code), the session_timezone variable was not updated to match the nominal timezone; rather, all code was expected to ignore session_timezone if HasCTZSet was true. This is of course obviously fragile, though a search of the code finds only timeofday() failing to honor the rule. A bigger problem was that DetermineTimeZoneOffset() supposed that if its pg_tz parameter was pointer-equal to session_timezone, then HasCTZSet should override the parameter. This would cause datetime input containing an explicit zone name to be treated as referencing the brute-force zone instead, if the zone name happened to match the session timezone that had prevailed before installing the brute-force zone setting (as reported in bug #8572). The same malady could affect AT TIME ZONE operators. To fix, set up session_timezone so that it matches the brute-force zone specification, which we can do using the POSIX timezone definition syntax "<abbrev>offset", and get rid of the bogus lookaside check in DetermineTimeZoneOffset(). Aside from fixing the erroneous behavior in datetime parsing and AT TIME ZONE, this will cause the timeofday() function to print its result in the user-requested time zone rather than some previously-set zone. It might also affect results in third-party extensions, if there are any that make use of session_timezone without considering HasCTZSet, but in all cases the new behavior should be saner than before. Back-patch to all supported branches.	2013-11-01 12:13:18 -04:00
Robert Haas	cacbdd7810	Use appendStringInfoString instead of appendStringInfo where possible. This shaves a few cycles, and generally seems like good programming practice. David Rowley	2013-10-31 10:55:59 -04:00
Tom Lane	c2b51cf190	Improve documentation about usage of FDW validator functions. SGML documentation, as well as code comments, failed to note that an FDW's validator will be applied to foreign-table options for foreign tables using the FDW. Etsuro Fujita	2013-10-28 10:28:35 -04:00
Heikki Linnakangas	83eb54001c	Fix two bugs in setting the vm bit of empty pages. Use a critical section when setting the all-visible flag on an empty page, and WAL-logging it. log_newpage_buffer() contains an assertion that it must be called inside a critical section, and it's the right thing to do when modifying a buffer anyway. Also, the page should be marked dirty before calling log_newpage_buffer(), per the comment in log_newpage_buffer() and src/backend/access/transam/README. Patch by Andres Freund, in response to my report. Backpatch to 9.2, like the patch that introduced these bugs (`a6370fd9`).	2013-10-23 14:24:37 +03:00
Robert Haas	cab5dc5daf	Allow only some columns of a view to be auto-updateable. Previously, unless all columns were auto-updateable, we wouldn't allow inserts, updates, or deletes, or at least not without a rule or trigger; now, we'll allow inserts and updates that target only the auto-updateable columns, and deletes even if there are no auto-updateable columns at all provided the view definition is otherwise suitable. Dean Rasheed, reviewed by Marko Tiikkaja	2013-10-18 10:35:36 -04:00
Peter Eisentraut	5b6d08cd29	Add use of asprintf() Add asprintf(), pg_asprintf(), and psprintf() to simplify string allocation and composition. Replacement implementations taken from NetBSD. Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Asif Naeem <anaeem.it@gmail.com>	2013-10-13 00:09:18 -04:00
Kevin Grittner	f566515192	Add record_image_ops opclass for matview concurrent refresh. REFRESH MATERIALIZED VIEW CONCURRENTLY was broken for any matview containing a column of a type without a default btree operator class. It also did not produce results consistent with a non- concurrent REFRESH or a normal view if any column was of a type which allowed user-visible differences between values which compared as equal according to the type's default btree opclass. Concurrent matview refresh was modified to use the new operators to solve these problems. Documentation was added for record comparison, both for the default btree operator class for record, and the newly added operators. Regression tests now check for proper behavior both for a matview with a box column and a matview containing a citext column. Reviewed by Steve Singer, who suggested some of the doc language.	2013-10-09 14:26:09 -05:00
Robert Haas	16a906f535	Make DISCARD SEQUENCES also discard the last used sequence. Otherwise, we access already-freed memory. Oops. Report by Michael Paquier. Fix by me.	2013-10-07 15:55:56 -04:00
Robert Haas	0f1ef79095	Fix silly thinko in ResetSequenceCaches. Report from Kevin Hale Boyes.	2013-10-03 20:17:51 -04:00
Robert Haas	d90ced8bb2	Add DISCARD SEQUENCES command. DISCARD ALL will now discard cached sequence information, as well. Fabrízio de Royes Mello, reviewed by Zoltán Böszörményi, with some further tweaks by me.	2013-10-03 16:23:31 -04:00
Alvaro Herrera	15732b34e8	Add WaitForLockers in lmgr, refactoring index.c code This is in support of a future REINDEX CONCURRENTLY feature. Michael Paquier	2013-10-01 17:57:01 -03:00
Heikki Linnakangas	adaba2751f	Fix spurious warning after vacuuming a page on a table with no indexes. There is a rare race condition, when a transaction that inserted a tuple aborts while vacuum is processing the page containing the inserted tuple. Vacuum prunes the page first, which normally removes any dead tuples, but if the inserting transaction aborts right after that, the loop after pruning will see a dead tuple and remove it instead. That's OK, but if the page is on a table with no indexes, and the page becomes completely empty after removing the dead tuple (or tuples) on it, it will be immediately marked as all-visible. That's OK, but the sanity check in vacuum would throw a warning because it thinks that the page contains dead tuples and was nevertheless marked as all-visible, even though it just vacuumed away the dead tuples and so it doesn't actually contain any. Spotted this while reading the code. It's difficult to hit the race condition otherwise, but can be done by putting a breakpoint after the heap_page_prune() call. Backpatch all the way to 8.4, where this code first appeared.	2013-09-26 11:31:53 +03:00
Robert Haas	ba3d39c969	Don't allow system columns in CHECK constraints, except tableoid. Previously, arbitrary system columns could be mentioned in table constraints, but they were not correctly checked at runtime, because the values weren't actually set correctly in the tuple. Since it seems easy enough to initialize the table OID properly, do that, and continue allowing that column, but disallow the rest unless and until someone figures out a way to make them work properly. No back-patch, because this doesn't seem important enough to take the risk of destabilizing the back branches. In fact, this will pose a dump-and-reload hazard for those upgrading from previous versions: constraints that were accepted before but were not correctly enforced will now either be enforced correctly or not accepted at all. Either could result in restore failures, but in practice I think very few users will notice the difference, since the use case is pretty marginal anyway and few users will be relying on features that have not historically worked. Amit Kapila, reviewed by Rushabh Lathia, with doc changes by me.	2013-09-23 13:31:22 -04:00
Alvaro Herrera	dd778e9d88	Rename various "freeze multixact" variables It seems to make more sense to use "cutoff multixact" terminology throughout the backend code; "freeze" is associated with replacing of an Xid with FrozenTransactionId, which is not what we do for MultiXactIds. Andres Freund Some adjustments by Álvaro Herrera	2013-09-16 15:47:31 -03:00
Tom Lane	0c66a22377	Update comments concerning PGC_S_TEST. This GUC context value was once only used by ALTER DATABASE SET and ALTER USER SET. That's not true anymore, though, so rewrite the comments to be a bit more general. Patch in HEAD only, since this is just an internal documentation issue.	2013-09-03 18:56:22 -04:00
Tom Lane	0d3f4406df	Allow aggregate functions to be VARIADIC. There's no inherent reason why an aggregate function can't be variadic (even VARIADIC ANY) if its transition function can handle the case. Indeed, this patch to add the feature touches none of the planner or executor, and little of the parser; the main missing stuff was DDL and pg_dump support. It is true that variadic aggregates can create the same sort of ambiguity about parameters versus ORDER BY keys that was complained of when we (briefly) had both one- and two-argument forms of string_agg(). However, the policy formed in response to that discussion only said that we'd not create any built-in aggregates with varying numbers of arguments, not that we shouldn't allow users to do it. So the logical extension of that is we can allow users to make variadic aggregates as long as we're wary about shipping any such in core. In passing, this patch allows aggregate function arguments to be named, to the extent of remembering the names in pg_proc and dumping them in pg_dump. You can't yet call an aggregate using named-parameter notation. That seems like a likely future extension, but it'll take some work, and it's not what this patch is really about. Likewise, there's still some work needed to make window functions handle VARIADIC fully, but I left that for another day. initdb forced because of new aggvariadic field in Aggref parse nodes.	2013-09-03 17:08:46 -04:00
Robert Haas	090d0f2050	Allow discovery of whether a dynamic background worker is running. Using the infrastructure provided by this patch, it's possible either to wait for the startup of a dynamically-registered background worker, or to poll the status of such a worker without waiting. In either case, the current PID of the worker process can also be obtained. As usual, worker_spi is updated to demonstrate the new functionality. Patch by me. Review by Andres Freund.	2013-08-28 14:08:13 -04:00
Kevin Grittner	28154bb23b	Remove relcache entry invalidation in REFRESH MATERIALIZED VIEW. This was added as part of the attempt to support unlogged matviews along with a populated status. It got missed when unlogged support was removed pre-commit. Noticed by Noah Misch. Back-patched to 9.3 branch.	2013-08-18 16:19:22 -05:00
Kevin Grittner	3f78b1715c	Don't allow ALTER MATERIALIZED VIEW ADD UNIQUE. Was accidentally allowed, but not documented and lacked support for rename or drop once created. Per report from Noah Misch.	2013-08-15 13:14:48 -05:00
Kevin Grittner	e2cd368678	Remove Assert that matview is not in system schema from REFRESH. We don't want to prevent an extension which creates a matview from being installed in pg_catalog. Issue was raised by Hitoshi Harada. Backpatched to 9.3.	2013-08-14 12:36:55 -05:00
Kevin Grittner	841c29c8b3	Various cleanups for REFRESH MATERIALIZED VIEW CONCURRENTLY. Open and lock each index before checking definition in RMVC. The ExclusiveLock on the related table is not viewed as sufficient to ensure that no changes are made to the index definition, and invalidation messages from other backends might have been missed. Additionally, use RelationGetIndexExpressions() and check for NIL rather than doing our own loop. Protect against redefinition of tid and rowvar operators in RMVC. While working on this, noticed that the fixes for bugs found during the CF made the UPDATE statement useless, since no rows could qualify for that treatment any more. Ripping out code to support the UPDATE statement simplified the operator cleanups. Change slightly confusing local field name. Use meaningful alias names on queries in refresh_by_match_merge(). Per concerns raised by Andres Freund and comments and suggestions from Noah Misch. Some additional issues remain, which will be addressed separately.	2013-08-05 09:57:56 -05:00

2748 Commits