postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	b514a7460d	Fix planning of star-schema-style queries. Part of the intent of the parameterized-path mechanism was to handle star-schema queries efficiently, but some overly-restrictive search limiting logic added in commit `e2fa76d80b` prevented such cases from working as desired. Fix that and add a regression test about it. Per gripe from Marc Cousin. This is arguably a bug rather than a new feature, so back-patch to 9.2 where parameterized paths were introduced.	2015-02-28 12:43:04 -05:00
Tom Lane	c4f4c7ca99	Improve mmgr README. Add documentation about the new reset callback mechanism. Also, at long last, recast the existing text so that it describes the current context mechanisms as established fact rather than something we're going to implement. Shoulda done that in 2001 or so ...	2015-02-27 20:32:34 -05:00
Tom Lane	d61f1a9327	Suppress uninitialized-variable warning from less-bright compilers. The type variable must get set on first iteration of the while loop, but there are reasonably modern gcc versions that don't realize that. Initialize it with a dummy value. This undoes a removal of initialization in commit `654809e770`.	2015-02-27 18:19:22 -05:00
Tom Lane	eaa5808e8e	Redefine MemoryContextReset() as deleting, not resetting, child contexts. That is, MemoryContextReset() now means what was formerly meant by MemoryContextResetAndDeleteChildren(), and the latter is now just a macro alias for the former. If you really want the functionality that was formerly provided by MemoryContextReset(), what you have to do is MemoryContextResetChildren() plus MemoryContextResetOnly() (which is a new API to reset only the named context and not touch its children). The reason for this change is that near fifteen years of experience has proven that there is noplace where old-style MemoryContextReset() is actually what you want. Making that the default behavior has led to lots of context-leakage bugs, while we've not found anyplace where it's actually necessary to keep the child contexts; at least the standard regression tests do not reveal anyplace where this change breaks anything. And there are upcoming patches that will introduce additional reasons why child contexts need to be removed. We could change existing calls of MemoryContextResetAndDeleteChildren to be just MemoryContextReset, but for the moment I'll leave them alone; they're not costing anything.	2015-02-27 18:10:04 -05:00
Alvaro Herrera	fbef4342a8	Make CREATE OR REPLACE VIEW internally more consistent. The way that columns are added to a view is by calling AlterTableInternal with special subtype AT_AddColumnToView; but that subtype is changed to AT_AddColumnRecurse by ATPrepAddColumn. This has no visible effect in the current code, since views cannot have inheritance children (thus the recursion step is a no-op) and adding a column to a view is executed identically to doing it to a table; but it does make a difference for future event trigger code keeping track of commands, because the current situation leads to confusing the case with a normal ALTER TABLE ADD COLUMN. Fix the problem by passing a flag to ATPrepAddColumn to prevent it from changing the command subtype. The event trigger code can then properly ignore the subcommand. (We could remove the call to ATPrepAddColumn, since views are never typed, and there is never a need for recursion, which are the two conditions that are checked by ATPrepAddColumn; but it seems more future-proof to keep the call in place.)	2015-02-27 19:19:34 -03:00
Tom Lane	f65e827058	Invent a memory context reset/delete callback mechanism. This allows cleanup actions to be registered to be called just before a particular memory context's contents are flushed (either by deletion or MemoryContextReset). The patch in itself has no use-cases for this, but several likely reasons for wanting this exist. In passing, per discussion, rearrange some boolean fields in struct MemoryContextData so as to avoid wasted padding space. For safety, this requires making allowInCritSection's existence unconditional; but I think that's a better approach than what was there anyway.	2015-02-27 17:16:43 -05:00
Alvaro Herrera	654809e770	Fix a couple of trivial issues in jsonb.c. Typo "aggreagate" appeared three times, and the return value of function JsonbIteratorNext() was being assigned to an int variable in a bunch of places.	2015-02-27 18:54:49 -03:00
Alvaro Herrera	3f190f67eb	Fix table_rewrite event trigger for ALTER TYPE/SET DATA TYPE CASCADE. When a composite type being used in a typed table is modified by way of ALTER TYPE, a table rewrite occurs appearing to come from ALTER TYPE. The existing event_trigger.c code was unable to cope with that and raised a spurious error. The fix is just to accept that command tag for the event, and document this properly. Noted while fooling with deparsing of DDL commands. This appears to be an oversight in commit `618c9430a`. Thanks to Mark Wong for documentation wording help.	2015-02-27 18:39:53 -03:00
Andrew Dunstan	bda76c1c8c	Render infinite date/timestamps as 'infinity' for json/jsonb. Commit `ab14a73a6c` raised an error in these cases and later the behaviour was copied to jsonb. This is what the XML code, which we then adopted, does, as the XSD types don't accept infinite values. However, json dates and timestamps are just strings as far as json is concerned, so there is no reason not to render these values as 'infinity'. The json portion of this is backpatched to 9.4 where the behaviour was introduced. The jsonb portion only affects the development branch. Per gripe on pgsql-general.	2015-02-26 12:25:21 -05:00
Andres Freund	fd6a3f3ad4	Reconsider when to wait for WAL flushes/syncrep during commit. Up to now RecordTransactionCommit() waited for WAL to be flushed (if synchronous_commit != off) and to be synchronously replicated (if enabled), even if a transaction did not have a xid assigned. The primary reason for that is that sequence's nextval() did not assign a xid, but its changes are worthwhile to wait for on commit. This can be problematic because sometimes read only transactions do write WAL, e.g. HOT page prune records. That then could lead to read only transactions having to wait during commit. Not something people expect in a read only transaction. This led to such strange symptoms as backends being seemingly stuck during connection establishment when all synchronous replicas are down. Especially annoying when said stuck connection is the standby trying to reconnect to allow syncrep again... This behavior also is involved in a rather complicated <= 9.4 bug where the transaction started by catchup interrupt processing waited for syncrep using latches, but didn't get the wakeup because it was already running inside the same overloaded signal handler. Fixing the issue here doesn't properly solve that issue, merely papers over the problems. In 9.5 catchup interrupts aren't processed out of signal handlers anymore. To fix all this, make nextval() acquire a top level xid, and only wait for transaction commit if a transaction both acquired a xid and emitted WAL records. If only a xid has been assigned we don't uselessly want to wait just because of writes to temporary/unlogged tables; if only WAL has been written we don't want to wait just because of HOT prunes. The xid assignment in nextval() is unlikely to cause overhead in real-world workloads. For one it only happens every SEQ_LOG_VALS/32 values anyway, for another only usage of nextval() without using the result in an insert or similar is affected. Discussion: 20150223165359.GF30784@awork2.anarazel.de, 369698E947874884A77849D8FE3680C2@maumau, 5CF4ABBA67674088B3941894E22A0D25@maumau. Per complaint from maumau and Thom Brown. Backpatch all the way back; 9.0 doesn't have syncrep, but it seems better to have consistent behavior across all maintained branches.	2015-02-26 12:50:07 +01:00
Noah Misch	f5ef00aed4	Free SQLSTATE and SQLERRM no earlier than other PL/pgSQL variables. "RETURN SQLERRM" prompted plpgsql_exec_function() to read from freed memory. Back-patch to 9.0 (all supported versions). Little code ran between the premature free and the read, so non-assert builds are unlikely to witness user-visible consequences.	2015-02-25 23:48:28 -05:00
Stephen Frost	62a4a1af5d	Add hasRowSecurity to copyfuncs/outfuncs. The RLS patch added a hasRowSecurity field to PlannerGlobal and PlannedStmt but didn't update nodes/copyfuncs.c and nodes/outfuncs.c to reflect those additional fields. Correct that by adding entries to the appropriate functions for those fields. Pointed out by Robert.	2015-02-25 23:35:04 -05:00
Stephen Frost	6f9bd50eab	Add locking clause for SB views for update/delete. In expand_security_qual(), we were handling locking correctly when a PlanRowMark existed, but not when we were working with the target relation (which doesn't have any PlanRowMarks, but the subquery created for the security barrier quals still needs to lock the rows under it). Noted by Etsuro Fujita when working with the Postgres FDW, which wasn't properly issuing a SELECT ... FOR UPDATE to the remote side under a DELETE. Back-patch to 9.4 where updatable security barrier views were introduced. Per discussion with Etsuro and Dean Rasheed.	2015-02-25 21:36:29 -05:00
Tom Lane	77903ede08	Fix over-optimistic caching in fetch_array_arg_replace_nulls(). When I rewrote this in commit `56a79a869b`, I forgot that it's possible for the input array type to change from one call to the next (this can happen when applying the function to pg_statistic columns, for instance). Fix that.	2015-02-25 14:19:13 -05:00
Tom Lane	e9f1c01b71	Fix dumping of views that are just VALUES(...) but have column aliases. The "simple" path for printing VALUES clauses doesn't work if we need to attach nondefault column aliases, because there's noplace to do that in the minimal VALUES() syntax. So modify get_simple_values_rte() to detect nondefault aliases and treat that as a non-simple case. This further exposes that the "non-simple" path never actually worked; it didn't produce valid syntax. Fix that too. Per bug #12789 from Curtis McEnroe, and analysis by Andrew Gierth. Back-patch to all supported branches. Before 9.3, this also requires back-patching the part of commit `092d7ded29` that created get_simple_values_rte() to begin with; inserting the extra test into the old factorization of that logic would've been too messy.	2015-02-25 12:01:12 -05:00
Michael Meskes	8794bf1ca1	Remove null-pointer checks that are not needed. If a pointer is guaranteed to carry information there is no need to check for NULL again. Patch by Michael Paquier.	2015-02-25 11:50:28 +01:00
Tom Lane	d809fd0008	Improve parser's one-extra-token lookahead mechanism. There are a couple of places in our grammar that fail to be strict LALR(1), by requiring more than a single token of lookahead to decide what to do. Up to now we've dealt with that by using a filter between the lexer and parser that merges adjacent tokens into one in the places where two tokens of lookahead are necessary. But that creates a number of user-visible anomalies, for instance that you can't name a CTE "ordinality" because "WITH ordinality AS ..." triggers folding of WITH and ORDINALITY into one token. I realized that there's a better way. In this patch, we still do the lookahead basically as before, but we never merge the second token into the first; we replace just the first token by a special lookahead symbol when one of the lookahead pairs is seen. This requires a couple extra productions in the grammar, but it involves fewer special tokens, so that the grammar tables come out a bit smaller than before. The filter logic is no slower than before, perhaps a bit faster. I also fixed the filter logic so that when backing up after a lookahead, the current token's terminator is correctly restored; this eliminates some weird behavior in error message issuance, as is shown by the one change in existing regression test outputs. I believe that this patch entirely eliminates odd behaviors caused by lookahead for WITH. It doesn't really improve the situation for NULLS followed by FIRST/LAST unfortunately: those sequences still act like a reserved word, even though there are cases where they should be seen as two ordinary identifiers, eg "SELECT nulls first FROM ...". I experimented with additional grammar hacks but couldn't find any simple solution for that. Still, this is better than before, and it seems much more likely that we could somehow solve the NULLS case on the basis of this filter behavior than the previous one.	2015-02-24 17:53:45 -05:00
Peter Eisentraut	23a78352c0	Error when creating names too long for tar format. The tar format (at least the version we are using) does not support file names or symlink targets longer than 99 bytes. Until now, the tar creation code would silently truncate any names that are too long. (Its original application was pg_dump, where this never happens.) This creates problems when running base backups over the replication protocol. The most important problem is when a tablespace path is longer than 99 bytes, which will result in a truncated tablespace path being backed up. Less importantly, the basebackup protocol also promises to back up any other files it happens to find in the data directory, which would also lead to file name truncation if someone put a file with a long name in there. Now both of these cases result in an error during the backup. Add tests that fail when a too-long file name or symlink is attempted to be backed up. Reviewed-by: Robert Haas <robertmhaas@gmail.com>	2015-02-24 13:41:07 -05:00
Heikki Linnakangas	dd58c6098f	Fix typo in README. Kyotaro Horiguchi	2015-02-24 14:33:26 +02:00
Alvaro Herrera	d1712d01d0	Fix stupid merge errors in previous commit. Brown paper bag installed permanently.	2015-02-23 15:05:37 -03:00
Tom Lane	56be925e4b	Further tweaking of raw grammar output to distinguish different inputs. Use a different A_Expr_Kind for LIKE/ILIKE/SIMILAR TO constructs, so that they can be distinguished from direct invocation of the underlying operators. Also, postpone selection of the operator name when transforming "x IN (select)" to "x = ANY (select)", so that those syntaxes can be told apart at parse analysis time. I had originally thought I'd also have to do something special for the syntaxes IS NOT DISTINCT FROM, IS NOT DOCUMENT, and x NOT IN (SELECT...), which the grammar translates as though they were NOT (construct). On reflection though, we can distinguish those cases reliably by noting whether the parse location shown for the NOT is the same as for its child node. This only requires tweaking the parse locations for NOT IN, which I've done here. These changes should have no effect outside the parser; they're just in support of being able to give accurate warnings for planned operator precedence changes.	2015-02-23 12:46:50 -05:00
Alvaro Herrera	296f3a6053	Support more commands in event triggers. COMMENT, SECURITY LABEL, and GRANT/REVOKE now also fire ddl_command_start and ddl_command_end event triggers, when they operate on database-local objects. Reviewed-By: Michael Paquier, Andres Freund, Stephen Frost	2015-02-23 14:22:42 -03:00
Heikki Linnakangas	88e9823026	Replace checkpoint_segments with min_wal_size and max_wal_size. Instead of having a single knob (checkpoint_segments) that both triggers checkpoints, and determines how many checkpoints to recycle, they are now separate concerns. There is still an internal variable called CheckpointSegments, which triggers checkpoints. But it no longer determines how many segments to recycle at a checkpoint. That is now auto-tuned by keeping a moving average of the distance between checkpoints (in bytes), and trying to keep that many segments in reserve. The advantage of this is that you can set max_wal_size very high, but the system won't actually consume that much space if there isn't any need for it. The min_wal_size sets a floor for that; you can effectively disable the auto-tuning behavior by setting min_wal_size equal to max_wal_size. The max_wal_size setting is now the actual target size of WAL at which a new checkpoint is triggered, instead of the distance between checkpoints. Previously, you could calculate the actual WAL usage with the formula "(2 + checkpoint_completion_target) * checkpoint_segments + 1". With this patch, you set the desired WAL usage with max_wal_size, and the system calculates the appropriate CheckpointSegments with the reverse of that formula. That's a lot more intuitive for administrators to set. Reviewed by Amit Kapila and Venkata Balaji N.	2015-02-23 18:53:02 +02:00
Heikki Linnakangas	0fec000365	Renumber GUC_* constants. This moves all the regular flags back together (for aesthetic reasons), and makes room for more GUC_UNIT_* types.	2015-02-23 18:33:16 +02:00
Heikki Linnakangas	1b63026473	Refactor unit conversions code in guc.c. Replace the if-switch-case constructs with two conversion tables, containing all the supported conversions between human-readable unit strings and the base units used in GUC variables. This makes the code easier to read, and makes adding new units simpler.	2015-02-23 18:06:16 +02:00
Andres Freund	bc208a5a2f	Guard against spurious signals in LockBufferForCleanup. When LockBufferForCleanup() has to wait for getting a cleanup lock on a buffer it does so by setting a flag in the buffer header and then waiting for other backends to signal it using ProcWaitForSignal(). Unfortunately LockBufferForCleanup() missed that ProcWaitForSignal() can return for other reasons than the signal it is hoping for. If such a spurious signal arrives the wait flags on the buffer header will still be set. That then triggers "ERROR: multiple backends attempting to wait for pincount 1". The fix is simple: unset the flag if still set when retrying. That implies an additional spinlock acquisition/release, but that's unlikely to matter given the cost of waiting for a cleanup lock. Alternatively it'd have been possible to move responsibility for maintaining the relevant flag to the waiter altogether, but that might have had negative consequences due to possible floods of signals. Besides being more invasive. This looks to be a very longstanding bug. The relevant code in LockBufferForCleanup() hasn't changed materially since its introduction and ProcWaitForSignal() was documented to return for unrelated reasons since 8.2. The master only patch series removing ImmediateInterruptOK made it much easier to hit though, as ProcSendSignal/ProcWaitForSignal now uses a latch shared with other tasks. Per discussion with Kevin Grittner, Tom Lane and me. Backpatch to all supported branches. Discussion: 11553.1423805224@sss.pgh.pa.us	2015-02-23 16:14:14 +01:00
Fujii Masao	5d2b45e3f7	Add GUC to control the time to wait before retrieving WAL after failed attempt. Previously when the standby server failed to retrieve WAL files from any sources (i.e., streaming replication, local pg_xlog directory or WAL archive), it always waited for five seconds (hard-coded) before the next attempt. For example, this is problematic in warm-standby because restore_command can fail every five seconds even while new WAL file is expected to be unavailable for a long time and flood the log files with its error messages. This commit adds new parameter, wal_retrieve_retry_interval, to control that wait time. Alexey Vasiliev and Michael Paquier, reviewed by Andres Freund and me.	2015-02-23 20:55:17 +09:00
Heikki Linnakangas	2a3f6e368b	Fix potential deadlock with libpq non-blocking mode. If libpq output buffer is full, pqSendSome() function tries to drain any incoming data. This avoids deadlock, if the server e.g. sends a lot of NOTICE messages, and blocks until we read them. However, pqSendSome() only did that in blocking mode. In non-blocking mode, the deadlock could still happen. To fix, take a two-pronged approach: 1. Change the documentation to instruct that when PQflush() returns 1, you should wait for both read- and write-ready, and call PQconsumeInput() if it becomes read-ready. That fixes the deadlock, but applications are not going to change overnight. 2. In pqSendSome(), drain the input buffer before returning 1. This alleviates the problem for applications that only wait for write-ready. In particular, a slow but steady stream of NOTICE messages during COPY FROM STDIN will no longer cause a deadlock. The risk remains that the server attempts to send a large burst of data and fills its output buffer, and at the same time the client also sends enough data to fill its output buffer. The application will deadlock if it goes to sleep, waiting for the socket to become write-ready, before the server's data arrives. In practice, NOTICE messages and such that the server might be sending are usually short, so it's highly unlikely that the server would fill its output buffer so quickly. Backpatch to all supported versions.	2015-02-23 13:34:21 +02:00
Tom Lane	c063da1769	Add parse location fields to NullTest and BooleanTest structs. We did not need a location tag on NullTest or BooleanTest before, because no error messages referred directly to their locations. That's planned to change though, so add these fields in a separate housekeeping commit. Catversion bump because stored rules may change.	2015-02-22 14:40:27 -05:00
Tom Lane	6a75562ed1	Get rid of multiple applications of transformExpr() to the same tree. transformExpr() has for many years had provisions to do nothing when applied to an already-transformed expression tree. However, this was always ugly and of dubious reliability, so we'd be much better off without it. The primary historical reason for it was that gram.y sometimes returned multiple links to the same subexpression, which is no longer true as of my BETWEEN fixes. We'd also grown some lazy hacks in CREATE TABLE LIKE (failing to distinguish between raw and already-transformed index specifications) and one or two other places. This patch removes the need for and support for re-transforming already transformed expressions. The index case is dealt with by adding a flag to struct IndexStmt to indicate that it's already been transformed; which has some benefit anyway in that tablecmds.c can now Assert that transformation has happened rather than just assuming. The other main reason was some rather sloppy code for array type coercion, which can be fixed (and its performance improved too) by refactoring. I did leave transformJoinUsingClause() still constructing expressions containing untransformed operator nodes being applied to Vars, so that transformExpr() still has to allow Var inputs. But that's a much narrower, and safer, special case than before, since Vars will never appear in a raw parse tree, and they don't have any substructure to worry about. In passing fix some oversights in the patch that added CREATE INDEX IF NOT EXISTS (missing processing of IndexStmt.if_not_exists). These appear relatively harmless, but still sloppy coding practice.	2015-02-22 13:59:09 -05:00
Tom Lane	34af082f95	Represent BETWEEN as a special node type in raw parse trees. Previously, gram.y itself converted BETWEEN into AND (or AND/OR) nests of expression comparisons. This was always as bogus as could be, but fixing it hasn't risen to the top of the to-do list. The present patch invents an A_Expr representation for BETWEEN expressions, and does the expansion to comparison trees in parse_expr.c which is at least a slightly saner place to be doing semantic conversions. There should be no change in the post-parse-analysis results. This does nothing for the semantic issues with BETWEEN (dubious connection to btree-opclass semantics, and multiple evaluation of possibly volatile subexpressions) ... but it's a necessary preliminary step before we could fix any of that. The main immediate benefit is that preserving BETWEEN as an identifiable raw-parse-tree construct will enable better error messages. While at it, fix the code so that multiply-referenced subexpressions are physically duplicated before being passed through transformExpr(). This gets rid of one of the principal reasons why transformExpr() has historically had to allow already-processed input.	2015-02-22 13:57:56 -05:00
Jeff Davis	74811c4050	Rename variable in AllocSetContextCreate to be consistent. Everywhere else in the file, "context" is of type MemoryContext and "set" is of type AllocSet. AllocSetContextCreate uses a variable of type AllocSet, so rename it from "context" to "set".	2015-02-21 23:17:52 -08:00
Jeff Davis	b419865a81	In array_agg(), don't create a new context for every group. Previously, each new array created a new memory context that started out at 8kB. This is incredibly wasteful when there are lots of small groups of just a few elements each. Change initArrayResult() and friends to accept a "subcontext" argument to indicate whether the caller wants the ArrayBuildState allocated in a new subcontext or not. If not, it can no longer be released separately from the rest of the memory context. Fixes bug report by Frank van Vugt on 2013-10-19. Tomas Vondra. Reviewed by Ali Akbar, Tom Lane, and me.	2015-02-21 17:24:48 -08:00
Tom Lane	e9fd5545de	Try to fix busted gettimeofday() code. Per buildfarm, we have to match the _stdcall property of the system functions.	2015-02-21 17:15:13 -05:00
Tom Lane	332f02f88b	Use FLEXIBLE_ARRAY_MEMBER in Windows-specific code. Be a tad more paranoid about overlength input, too.	2015-02-21 16:49:35 -05:00
Andres Freund	82a532b34d	Force some system catalog table columns to be marked NOT NULL. In a manual pass over the catalog declaration I found a number of columns which the bootstrap automatism didn't mark NOT NULL even though they actually were. Add BKI_FORCE_NOT_NULL markings to them. It's usually not critical if a system table column is falsely determined to be nullable as the code should always catch relevant cases. But it's good to have an extra layer in place. Discussion: 20150215170014.GE15326@awork2.anarazel.de	2015-02-21 22:37:05 +01:00
Andres Freund	eb68379c38	Allow forcing nullness of columns during bootstrap. Bootstrap determines whether a column is null based on simple builtin rules. Those work surprisingly well, but nonetheless a few existing columns aren't set correctly. Additionally there is at least one patch sent to hackers where forcing the nullness of a column would be helpful. The bootstrap format has gained FORCE [NOT] NULL for this, which will be emitted by genbki.pl when BKI_FORCE_(NOT_)?NULL is specified for a column in a catalog header. This patch doesn't change the marking of any existing columns. Discussion: 20150215170014.GE15326@awork2.anarazel.de	2015-02-21 22:31:54 +01:00
Tom Lane	2e211211a7	Use FLEXIBLE_ARRAY_MEMBER in a number of other places. I think we're about done with this...	2015-02-21 16:12:14 -05:00
Tom Lane	e1a11d9311	Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[]. This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me	2015-02-21 15:13:06 -05:00
Tom Lane	3d9b6f31ee	Minor code beautification in conninfo_uri_parse_params(). Reading this made me itch, so clean the logic a bit.	2015-02-21 13:27:12 -05:00
Tom Lane	b26e208142	Fix misparsing of empty value in conninfo_uri_parse_params(). After finding an "=" character, the pointer was advanced twice when it should only advance once. This is harmless as long as the value after "=" has at least one character; but if it doesn't, we'd miss the terminator character and include too much in the value. In principle this could lead to reading off the end of memory. It does not seem worth treating as a security issue though, because it would happen on client side, and besides client logic that's taking conninfo strings from untrusted sources has much worse security problems than this. Report and patch received off-list from Thomas Fanghaenel. Back-patch to 9.2 where the faulty code was introduced.	2015-02-21 12:59:54 -05:00
Robert Haas	64235fecc6	Don't require users of src/port/gettimeofday.c to initialize it. Commit `8001fe67a3` introduced this requirement, but per discussion, we want to avoid requirements of this type to make things easier on the calling code. An especially important consideration is that this may be used in frontend code, not just the backend. Asif Naeem, reviewed by Michael Paquier	2015-02-21 12:17:04 -05:00
Tom Lane	f2874feb7c	Some more FLEXIBLE_ARRAY_MEMBER fixes.	2015-02-21 01:46:43 -05:00
Tom Lane	33b2a2c97f	Fix statically allocated struct with FLEXIBLE_ARRAY_MEMBER member. clang complains about this, not unreasonably, so define another struct that's explicitly for a WordEntryPos with exactly one element. While at it, get rid of pretty dubious use of a static variable for more than one purpose --- if it were being treated as const maybe I'd be okay with this, but it isn't.	2015-02-20 17:50:18 -05:00
Tom Lane	33a3b03d63	Use FLEXIBLE_ARRAY_MEMBER in some more places. Fix a batch of structs that are only visible within individual .c files. Michael Paquier	2015-02-20 17:32:01 -05:00
Tom Lane	c110eff132	Use FLEXIBLE_ARRAY_MEMBER in struct RecordIOData. I (tgl) fixed this last night in rowtypes.c, but I missed that the code had been copied into a couple of other places. Michael Paquier	2015-02-20 17:03:12 -05:00
Tom Lane	e38b1eb098	Use FLEXIBLE_ARRAY_MEMBER in struct varlena. This forces some minor coding adjustments in tuptoaster.c and inv_api.c, but the new coding there is cleaner anyway. Michael Paquier	2015-02-20 16:51:53 -05:00
Alvaro Herrera	8902f79264	Remove unnecessary and unreliable test	2015-02-20 14:03:49 -03:00
Alvaro Herrera	3b14bb7771	Update PGSTAT_FILE_FORMAT_ID. Previous commit should have bumped it but didn't. Oops. Per note from Tom.	2015-02-20 12:59:27 -03:00
Alvaro Herrera	d42358efb1	Have TRUNCATE update pgstat tuple counters. This works by keeping a per-subtransaction record of the ins/upd/del counters before the truncate, and then resetting them; this record is useful to return to the previous state in case the truncate is rolled back, either in a subtransaction or whole transaction. The state is propagated upwards as subtransactions commit. When the per-table data is sent to the stats collector, a flag indicates to reset the live/dead counters to zero as well. Catalog version bumped due to the change in pgstat format. Author: Alexander Shulgin Discussion: 1007.1207238291@sss.pgh.pa.us Discussion: 548F7D38.2000401@BlueTreble.com Reviewed-by: Álvaro Herrera, Jim Nasby	2015-02-20 12:10:01 -03:00
Tom Lane	5740be6d6e	Some more FLEXIBLE_ARRAY_MEMBER hacking.	2015-02-20 02:28:03 -05:00
Tom Lane	9aa53bbd15	Remove unused variable. Per buildfarm.	2015-02-20 00:47:28 -05:00
Tom Lane	692bd09ad1	Use "#ifdef CATALOG_VARLEN" to protect nullable fields of pg_authid. This gives a stronger guarantee than a mere comment against accessing these fields as simple struct members. Since rolpassword is in fact varlena, it's not clear why these didn't get marked from the beginning, but let's do it now. Michael Paquier	2015-02-20 00:23:48 -05:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Tom Lane	2fb7a75f37	Add pg_stat_get_snapshot_timestamp() to show statistics snapshot timestamp. Per discussion, this could be useful for purposes such as programmatically detecting a nonresponding stats collector. We already have the timestamp anyway, it's just a matter of providing a SQL-accessible function to fetch it. Matt Kelly, reviewed by Jim Nasby	2015-02-19 21:36:50 -05:00
Heikki Linnakangas	634618ecd0	Remove dead structs. These are not used with the new WAL format anymore. GIN split records are simply always recorded as full-page images. Michael Paquier	2015-02-19 21:14:37 +02:00
Tom Lane	56a79a869b	Split array_push into separate array_append and array_prepend functions. There wasn't any good reason for a single C function to implement both these SQL functions: it saved very little code overall, and it required significant pushups to re-determine at runtime which case applied. Redoing it as two functions ends up with just slightly more lines of code, but it's simpler to understand, and faster too because we need not repeat syscache lookups on every call. An important side benefit is that this eliminates the only case in which different aliases of the same C function had both anyarray and anyelement arguments at the same position, which would almost always be a mistake. The opr_sanity regression test will now notice such mistakes since there's no longer a valid case where it happens.	2015-02-18 20:53:33 -05:00
Peter Eisentraut	d30292b8c4	Fix Perl coding error in msvc build system Code like open(P, "cl /? 2>&1 |") || die "cl command not found"; does not actually catch any errors, because the exit status of the command before the pipe is ignored. The fix is to look at $?. This also gave the opportunity to clean up the logic of this code a bit.	2015-02-18 20:24:30 -05:00
Alvaro Herrera	9c7dd35019	Fix opclass/opfamily identity strings The original representation uses "opcname for amname", which is good enough; but if we replace "for" with "using", we can apply the returned identity directly in a DROP command, as in DROP OPERATOR CLASS opcname USING amname This slightly simplifies code using object identities to programmatically execute commands on these kinds of objects. Note backwards-incompatible change: The previous representation dates back to 9.3 when object identities were introduced by commit `f8348ea3`, but we don't want to change the behavior on released branches unnecessarily and so this is not backpatched.	2015-02-18 14:44:27 -03:00
Alvaro Herrera	0d906798f6	Fix object identities for pg_conversion objects We were neglecting to schema-qualify them. Backpatch to 9.3, where object identities were introduced as a concept by commit `f8348ea32e`.	2015-02-18 14:28:11 -03:00
Tom Lane	297b2c1ef9	Fix placement of "SET row_security" command issuance in pg_dump. Somebody apparently threw darts at the code to decide where to insert these. They certainly didn't proceed by adding them where other similar SETs were handled. This at least broke pg_restore, and perhaps other use-cases too.	2015-02-18 12:23:40 -05:00
Tom Lane	0e7e355f27	Fix failure to honor -Z compression level option in pg_dump -Fd. cfopen() and cfopen_write() failed to pass the compression level through to zlib, so that you always got the default compression level if you got any at all. In passing, also fix these and related functions so that the correct errno is reliably returned on failure; the original coding supposes that free() cannot change errno, which is untrue on at least some platforms. Per bug #12779 from Christoph Berg. Back-patch to 9.1 where the faulty code was introduced. Michael Paquier	2015-02-18 11:43:00 -05:00
Tom Lane	abe45a9b31	Fix EXPLAIN output for cases where parent table is excluded by constraints. The previous coding in EXPLAIN always labeled a ModifyTable node with the name of the target table affected by its first child plan. When originally written, this was necessarily the parent table of the inheritance tree, so everything was unconfusing. But when we added NO INHERIT constraints, it became possible for the parent table to be deleted from the plan by constraint exclusion while still leaving child tables present. This led to the ModifyTable plan node being labeled with the first surviving child, which was deemed confusing. Fix it by retaining the parent table's RT index in a new field in ModifyTable. Etsuro Fujita, reviewed by Ashutosh Bapat and myself	2015-02-17 18:04:11 -05:00
Heikki Linnakangas	931bf3eb9b	Fix a bug in pairing heap removal code. After removal, the next_sibling pointer of a node was sometimes incorrectly left to point to another node in the heap, which meant that a node was sometimes linked twice into the heap. Surprisingly that didn't cause any crashes in my testing, but it was clearly wrong and could easily segfault in other scenarios. Also always keep the prev_or_parent pointer as NULL on the root node. That was not a correctness issue AFAICS, but let's be tidy. Add a debugging function, to dump the contents of a pairing heap as a string. It's #ifdef'd out, as it's not used for anything in any normal code, but it was highly useful in debugging this. Let's keep it handy for further reference.	2015-02-17 22:55:53 +02:00
Heikki Linnakangas	d17b6df239	Fix knn-GiST queue comparison function to return heap tuples first. The part of the comparison function that was supposed to keep heap tuples ahead of index items was backwards. It would not lead to incorrect results, but it is more efficient to return heap tuples first, before scanning more index pages, when both have the same distance. Alexander Korotkov	2015-02-17 22:33:38 +02:00
Tom Lane	2e105def09	Remove code to match IPv4 pg_hba.conf entries to IPv4-in-IPv6 addresses. In investigating yesterday's crash report from Hugo Osvaldo Barrera, I only looked back as far as commit `f3aec2c7f5` where the breakage occurred (which is why I thought the IPv4-in-IPv6 business was undocumented). But actually the logic dates back to commit `3c9bb8886d` and was simply broken by erroneous refactoring in the later commit. A bit of archives excavation shows that we added the whole business in response to a report that some 2003-era Linux kernels would report IPv4 connections as having IPv4-in-IPv6 addresses. The fact that we've had no complaints since 9.0 seems to be sufficient confirmation that no modern kernels do that, so let's just rip it all out rather than trying to fix it. Do this in the back branches too, thus essentially deciding that our effective behavior since 9.0 is correct. If there are any platforms on which the kernel reports IPv4-in-IPv6 addresses as such, yesterday's fix would have made for a subtle and potentially security-sensitive change in the effective meaning of IPv4 pg_hba.conf entries, which does not seem like a good thing to do in minor releases. So let's let the post-9.0 behavior stand, and change the documentation to match it. In passing, I failed to resist the temptation to wordsmith the description of pg_hba.conf IPv4 and IPv6 address entries a bit. A lot of this text hasn't been touched since we were IPv4-only.	2015-02-17 12:49:18 -05:00
Robert Haas	5d6c2405f4	Improve pg_check_dir code and comments. Avoid losing errno if readdir() fails and closedir() works. Consistently return 4 rather than 3 if both a lost+found directory and other files are found, rather than returning one value or the other depending on the order of the directory listing. Update comments to match the actual behavior. These oversights date to commits `6f03927fce` and `17f1523932`. Marco Nenciarini	2015-02-17 10:19:30 -05:00
Tom Lane	cb66f495f5	Fix misuse of memcpy() in check_ip(). The previous coding copied garbage into a local variable, pretty much ensuring that the intended test of an IPv6 connection address against a promoted IPv4 address from pg_hba.conf would never match. The lack of field complaints likely indicates that nobody realized this was supposed to work, which is unsurprising considering that no user-facing docs suggest it should work. In principle this could have led to a SIGSEGV due to reading off the end of memory, but since the source address would have pointed to somewhere in the function's stack frame, that's quite unlikely. What led to discovery of the bug is Hugo Osvaldo Barrera's report of a crash after an OS upgrade, which is probably because he is now running a system in which memcpy raises abort() upon detecting overlapping source and destination areas. (You'd have to additionally suppose some things about the stack frame layout to arrive at this conclusion, but it seems plausible.) This has been broken since the code was added, in commit `f3aec2c7f5`, so back-patch to all supported branches.	2015-02-16 16:18:31 -05:00
Heikki Linnakangas	c478959a00	Fix comment in libpq OpenSSL code about why a substitute BIO is used. The comment was copy-pasted from the backend code along with the implementation, but libpq has different reasons for using the BIO.	2015-02-16 23:05:20 +02:00
Heikki Linnakangas	1c2b7c0879	Restore the SSL_set_session_id_context() call to OpenSSL renegotiation. This reverts the removal of the call in commit (`272923a0`). It turns out it wasn't superfluous after all: without it, renegotiation fails if a client certificate was used. The rest of the changes in that commit are still OK and not reverted. Per investigation of bug #12769 by Arne Scheffer, although this doesn't fix the reported bug yet.	2015-02-16 22:34:32 +02:00
Tom Lane	9e3ad1aac5	Use fast path in plpgsql's RETURN/RETURN NEXT in more cases. exec_stmt_return() and exec_stmt_return_next() have fast-path code for handling a simple variable reference (i.e. "return var") without going through the full expression evaluation machinery. For some reason, pl_gram.y was under the impression that this fast path only applied for record/row variables; but in reality code for handling regular scalar variables has been there all along. Adjusting the logic to allow that code to be used actually results in a net savings of code in pl_gram.y (by eliminating some redundancy), and it buys a measurable though not very impressive amount of speedup. Noted while fooling with my expanded-array patch, wherein this makes a much bigger difference because it enables returning an expanded array variable without an extra flattening step. But AFAICS this is a win regardless, so commit it separately.	2015-02-16 15:28:48 -05:00
Heikki Linnakangas	2c75531a6c	In the SSL test suite, use a root CA cert that won't expire (so quickly) All the other certificates were created to be valid for 10000 days, because we don't want to have to recreate them. But I missed the root CA cert, and the pre-created certificates included in the repository expired in January. Fix, and re-create all the certificates.	2015-02-16 22:11:43 +02:00
Tom Lane	e983c4d1aa	Rationalize the APIs of array element/slice access functions. The four functions array_ref, array_set, array_get_slice, array_set_slice have traditionally declared their array inputs and results as being of type "ArrayType *". This is a lie, and has been since Berkeley days, because they actually also support "fixed-length array" types such as "name" and "point"; not to mention that the inputs could be toasted. These values should be declared Datum instead to avoid confusion. The current coding already risks possible misoptimization by compilers, and it'll get worse when "expanded" array representations become a valid alternative. However, there's a fair amount of code using array_ref and array_set with arrays that *are* known to be ArrayType structures, and there might be more such places in third-party code. Rather than cluttering those call sites with PointerGetDatum/DatumGetArrayTypeP cruft, what I did was to rename the existing functions to array_get_element/array_set_element, fix their signatures, then reincarnate array_ref/array_set as backwards compatibility wrappers. array_get_slice/array_set_slice have no such constituency in the core code, and probably not in third-party code either, so I just changed their APIs.	2015-02-16 12:23:58 -05:00
Tom Lane	08361cea2b	Fix null-pointer-deref crash while doing COPY IN with check constraints. In commit `bf7ca15875` I introduced an assumption that an RTE referenced by a whole-row Var must have a valid eref field. This is false for RTEs constructed by DoCopy, and there are other places taking similar shortcuts. Perhaps we should make all those places go through addRangeTableEntryForRelation or its siblings instead of having ad-hoc logic, but the most reliable fix seems to be to make the new code in ExecEvalWholeRowVar cope if there's no eref. We can reasonably assume that there's no need to insert column aliases if no aliases were provided. Add a regression test case covering this, and also verifying that a sane column name is in fact available in this situation. Although the known case only crashes in 9.4 and HEAD, it seems prudent to back-patch the code change to 9.2, since all the ingredients for a similar failure exist in the variant patch applied to 9.3 and 9.2. Per report from Jean-Pierre Pelletier.	2015-02-15 23:26:45 -05:00
Peter Eisentraut	64cdbbc48c	pg_regress: Write processed input/*.source into output dir Before, it was writing the processed files into the input directory, which is incorrect in a vpath build.	2015-02-14 21:33:41 -05:00
Heikki Linnakangas	33e879c4e9	Fix broken #ifdef for __sparcv8 Rob Rowan. Backpatch to all supported versions, like the patch that added the broken #ifdef.	2015-02-13 23:56:25 +02:00
Heikki Linnakangas	80788a431e	Simplify waiting logic in reading from / writing to client. The client socket is always in non-blocking mode, and if we actually want blocking behaviour, we emulate it by sleeping and retrying. But we have retry loops at different layers for reads and writes, which was confusing. To simplify, remove all the sleeping and retrying code from the lower levels, from be_tls_read and secure_raw_read and secure_raw_write, and put all the logic in secure_read() and secure_write().	2015-02-13 21:46:14 +02:00
Heikki Linnakangas	272923a0a6	Simplify the way OpenSSL renegotiation is initiated in server. At least in all modern versions of OpenSSL, it is enough to call SSL_renegotiate() once, and then forget about it. Subsequent SSL_write() and SSL_read() calls will finish the handshake. The SSL_set_session_id_context() call is unnecessary too. We only have one SSL context, and the SSL session was created with that to begin with.	2015-02-13 21:46:08 +02:00
Bruce Momjian	866f3017a8	pg_upgrade: preserve freeze info for postgres/template1 dbs pg_database.datfrozenxid and pg_database.datminmxid were not preserved for the 'postgres' and 'template1' databases. This could cause missing clog file errors on access to user tables and indexes after upgrades in these databases. Backpatch through 9.0	2015-02-11 21:02:44 -05:00
Tom Lane	4f38a281a3	Fix missing PQclear() in libpqrcv_endstreaming(). This omission leaked one PGresult per WAL streaming cycle, which possibly would never be enough to notice in the real world, but it's still a leak. Per Coverity. Back-patch to 9.3 where the error was introduced.	2015-02-11 19:20:49 -05:00
Tom Lane	58146d35de	Fix minor memory leak in ident_inet(). We'd leak the ident_serv data structure if the second pg_getaddrinfo_all (the one for the local address) failed. This is not of great consequence because a failure return here just leads directly to backend exit(), but if this function is going to try to clean up after itself at all, it should not have such holes in the logic. Try to fix it in a future-proof way by having all the failure exits go through the same cleanup path, rather than "optimizing" some of them. Per Coverity. Back-patch to 9.2, which is as far back as this patch applies cleanly.	2015-02-11 19:09:54 -05:00
Tom Lane	9179444d07	Fix more memory leaks in failure path in buildACLCommands. We already had one go at this issue in commit `d73b7f973d`, but we failed to notice that buildACLCommands also leaked several PQExpBuffers along with a simply malloc'd string. This time let's try to make the fix a bit more future-proof by eliminating the separate exit path. It's still not exactly critical because pg_dump will curl up and die on failure; but since the amount of the potential leak is now several KB, it seems worth back-patching as far as 9.2 where the previous fix landed. Per Coverity, which evidently is smarter than clang's static analyzer.	2015-02-11 18:35:23 -05:00
Tom Lane	9feefedf9e	Fix pg_dump's heuristic for deciding which casts to dump. Back in 2003 we had a discussion about how to decide which casts to dump. At the time pg_dump really only considered an object's containing schema to decide what to dump (ie, dump whatever's not in pg_catalog), and so we chose a complicated idea involving whether the underlying types were to be dumped (cf commit `a6790ce857`). But users are allowed to create casts between built-in types, and we failed to dump such casts. Let's get rid of that heuristic, which has accreted even more ugliness since then, in favor of just looking at the cast's OID to decide if it's a built-in cast or not. In passing, also fix some really ancient code that supposed that it had to manufacture a dependency for the cast on its cast function; that's only true when dumping from a pre-7.3 server. This just resulted in some wasted cycles and duplicate dependency-list entries with newer servers, but we might as well improve it. Per gripes from a number of people, most recently Greg Sabino Mullane. Back-patch to all supported branches.	2015-02-10 22:38:15 -05:00
Tom Lane	1a179f36f7	Fix GEQO to not assume its join order heuristic always works. Back in commit `400e2c9344` I rewrote GEQO's gimme_tree function to improve its heuristic for modifying the given tour into a legal join order. In what can only be called a fit of hubris, I supposed that this new heuristic would always find a legal join order, and ripped out the old logic that allowed gimme_tree to sometimes fail. The folly of this is exposed by bug #12760, in which the "greedy" clumping behavior of merge_clump() can lead it into a dead end which could only be recovered from by un-clumping. We have no code for that and wouldn't know exactly what to do with it if we did. Rather than try to improve the heuristic rules still further, let's just recognize that it is a heuristic and probably must always have failure cases. So, put back the code removed in the previous commit to allow for failure (but comment it a bit better this time). It's possible that this code was actually fully correct at the time and has only been broken by the introduction of LATERAL. But having seen this example I no longer have much faith in that proposition, so back-patch to all supported branches.	2015-02-10 20:37:19 -05:00
Michael Meskes	1f393fc923	Fixed array handling in ecpg. When ecpg was rewritten to the new protocol version not all variable types were corrected. This patch rewrites the code for these types to fix that. It also fixes the documentation to correctly tell the status of array handling.	2015-02-10 12:04:10 +01:00
Heikki Linnakangas	025c02420d	Speed up CRC calculation using slicing-by-8 algorithm. This speeds up WAL generation and replay. The new algorithm is significantly faster with large inputs, like full-page images or when inserting wide rows. It is slower with tiny inputs, i.e. less than 10 bytes or so, but the speedup with longer inputs more than makes up for that. Even small WAL records have at least a 24-byte header at the front. The output is identical to the current byte-at-a-time computation, so this does not affect compatibility. The new algorithm is only used for the CRC-32C variant, not the legacy version used in tsquery or the "traditional" CRC-32 used in hstore and ltree. Those are not as performance critical, and are usually only applied over small inputs, so it seems better to not carry around the extra lookup tables to speed up those rare cases. Abhijit Menon-Sen	2015-02-10 10:54:40 +02:00
Heikki Linnakangas	cc761b170c	Fix MSVC build. When I moved pg_crc.c from src/port to src/common, I forgot to modify MSVC build script accordingly.	2015-02-09 22:13:50 +02:00
Tom Lane	bc4de01db3	Minor cleanup/code review for "indirect toast" stuff. Fix some issues I noticed while fooling with an extension to allow an additional kind of toast pointer. Much of this is just comment improvement, but there are a couple of actual bugs, which might or might not be reachable today depending on what can happen during logical decoding. An example is that toast_flatten_tuple() failed to cover the possibility of an indirection pointer in its input. Back-patch to 9.4 just in case that is reachable now. In HEAD, also correct some really minor issues with recent compression reorganization, such as dangerously underparenthesized macros.	2015-02-09 12:30:52 -05:00
Heikki Linnakangas	c619c2351f	Move pg_crc.c to src/common, and remove pg_crc_tables.h To get CRC functionality in a client program, you now need to link with libpgcommon instead of libpgport. The CRC code has nothing to do with portability, so libpgcommon is a better home. (libpgcommon didn't exist when pg_crc.c was originally moved to src/port.) Remove the possibility to get CRC functionality by just #including pg_crc_tables.h. I'm not aware of any extensions that actually did that and couldn't simply link with libpgcommon. This also moves the pg_crc.h header file from src/include/utils to src/include/common, which will require changes to any external programs that currently do #include "utils/pg_crc.h". That seems acceptable, as include/common is clearly the right home for it now, and the change needed to any such programs is trivial.	2015-02-09 11:17:56 +02:00
Fujii Masao	40bede5477	Move pg_lzcompress.c to src/common. The meta data of PGLZ symbolized by PGLZ_Header is removed, to make the compression and decompression code independent of the backend-only varlena facility. PGLZ_Header is being used to store some meta data related to the data being compressed like the raw length of the uncompressed record or some varlena-related data, making it unpluggable once PGLZ is stored in src/common as it contains some backend-only code paths with the management of varlena structures. The APIs of PGLZ are reworked at the same time to do only compression and decompression of buffers without the meta-data layer, simplifying more general use. On-disk format is preserved as well, so there is no incompatibility with previous major versions of PostgreSQL for TOAST entries. Exposing the compression and decompression APIs of pglz makes it possible for extensions and contrib modules to use them. In particular, this commit is required for the upcoming WAL compression feature so that the WAL reader facility can decompress the WAL data by using pglz_decompress. Michael Paquier, reviewed by me.	2015-02-09 15:15:24 +09:00
Noah Misch	237795a7b4	Check DCH_MAX_ITEM_SIZ limits with <=, not <. We reserve space for the full amount, not one less. The affected checks deal with localized month and day names. Today's DCH_MAX_ITEM_SIZ value would suffice for a 60-byte day name, while the longest known is the 49-byte mn_CN.utf-8 word for "Saturday." Thus, the upshot of this change is merely to avoid misdirecting future readers of the code; users are not expected to see errors either way.	2015-02-06 23:39:52 -05:00
Noah Misch	a7a4adcf8d	Assert(PqCommReadingMsg) in pq_peekbyte(). Interrupting pq_recvbuf() can break protocol sync, so its callers all deserve this assertion. The one pq_peekbyte() caller suffices already.	2015-02-06 23:14:27 -05:00
Heikki Linnakangas	ff16b40f8c	Report WAL flush, not insert, position in replication IDENTIFY_SYSTEM When beginning streaming replication, the client usually issues the IDENTIFY_SYSTEM command, which used to return the current WAL insert position. That's not suitable for the intended purpose of that field, however. pg_receivexlog uses it to start replication from the reported point, but if it hasn't been flushed to disk yet, it will fail. Change IDENTIFY_SYSTEM to report the flush position instead. Backpatch to 9.1 and above. 9.0 doesn't report any WAL position.	2015-02-06 11:26:50 +02:00
Michael Meskes	5ee5bc3873	This routine was calling ecpg_alloc to allocate memory but did not actually check the returned pointer, which is potentially NULL as it could be the result of a failed malloc call. Issue noted by Coverity, fixed by Michael Paquier <michael@otacoo.com>	2015-02-05 15:12:34 +01:00
Heikki Linnakangas	d88976cfa1	Use a separate memory context for GIN scan keys. It was getting tedious to track and release all the different things that form a scan key. We were leaking at least the queryCategories array, and possibly more, on a rescan. That was visible if a GIN index was used in a nested loop join. This also protects from leaks in extractQuery method. No backpatching, given the lack of complaints from the field. Maybe later, after this has received more field testing.	2015-02-04 17:40:25 +02:00
Heikki Linnakangas	57fe246890	Fix reference-after-free when waiting for another xact due to constraint. If an insertion or update had to wait for another transaction to finish, because there was another insertion with conflicting key in progress, we would pass a just-free'd item pointer to XactLockTableWait(). All calls to XactLockTableWait() and MultiXactIdWait() had similar issues. Some passed a pointer to a buffer in the buffer cache, after already releasing the lock. The call in EvalPlanQualFetch had already released the pin too. All but the call in execUtils.c would merely lead to reporting a bogus ctid, however (or an assertion failure, if enabled). All the callers that passed HeapTuple->t_data->t_ctid were slightly bogus anyway: if the tuple was updated (again) in the same transaction, its ctid field would point to the next tuple in the chain, not the tuple itself. Backpatch to 9.4, where the 'ctid' argument to XactLockTableWait was added (in commit `f88d4cfc`)	2015-02-04 16:00:34 +02:00
Heikki Linnakangas	c31b5d9ddf	Fix memory leaks on OOM in ecpg. These are fairly obscure cases, but let's keep Coverity happy. Michael Paquier with some further fixes by me.	2015-02-04 14:55:30 +02:00
Andres Freund	ff8ca3b04c	Add missing float.h include to snprintf.c. On windows _isnan() (which isnan() is redirected to in port/win32.h) is declared in float.h, not math.h. Per buildfarm animal currawong. Backpatch to all supported branches.	2015-02-04 13:27:31 +01:00
Heikki Linnakangas	302262d521	Add dummy PQsslAttributes function for non-SSL builds. All the other new SSL information functions had dummy versions in be-secure.c, but I missed PQsslAttributes(). Oops. Surprisingly, the linker did not complain about the missing function on most platforms represented in the buildfarm, even though it is exported, except for a few Windows systems.	2015-02-04 09:13:15 +02:00
Andres Freund	3a54f4a494	Remove ill-conceived Assertion in ProcessClientWriteInterrupt(). It's perfectly fine to have blocked interrupts when ProcessClientWriteInterrupt() is called. In fact it's commonly the case when emitting error reports. And we deal with that correctly. Even if that'd not be the case, it'd be a bad location for such an assertion. Because ProcessClientWriteInterrupt() is only called when the socket is blocked it's hard to hit. Per Heikki and buildfarm animals nightjar and dunlin.	2015-02-03 23:52:15 +01:00
Andres Freund	2505ce0be0	Remove remnants of ImmediateInterruptOK handling. Now that nothing sets ImmediateInterruptOK to true anymore, we can remove all the supporting code. Reviewed-By: Heikki Linnakangas	2015-02-03 23:25:47 +01:00
Andres Freund	d06995710b	Remove the option to service interrupts during PGSemaphoreLock(). The remaining caller (lwlocks) doesn't need that facility, and we plan to remove ImmediateInterruptOK entirely. That means that interrupts can't be serviced race-free and portably anyway, so there's little reason for keeping the feature. Reviewed-By: Heikki Linnakangas	2015-02-03 23:25:00 +01:00
Andres Freund	6753333f55	Move deadlock and other interrupt handling in proc.c out of signal handlers. Deadlock checking was performed inside signal handlers up to now. While it's a remarkable feat to have made this work reliably, it's quite complex to understand why that is the case. Partially it worked due to the assumption that semaphores are signal safe - which is not actually documented to be the case for sysv semaphores. The reason we had to rely on performing this work inside signal handlers is that semaphores aren't guaranteed to be interruptible by signals on all platforms. But now that latches provide a somewhat similar API, which actually has the guarantee of being interruptible, we can avoid doing so. Signalling between ProcSleep, ProcWakeup, ProcWaitForSignal and ProcSendSignal is now done using latches. This increases the likelihood of spurious wakeups. As spurious wakeups were already possible and aren't likely to be frequent enough to be an actual problem, this seems acceptable. This change would allow for further simplification of the deadlock checking, now that it doesn't have to run in a signal handler. But even if I were motivated to do so right now, it would still be better to do that separately. Such a cleanup shouldn't have to be reviewed at the same time as the more fundamental changes in this commit. There is one possible usability regression due to this commit. Namely it is more likely than before that log_lock_waits messages are output more than once. Reviewed-By: Heikki Linnakangas	2015-02-03 23:24:38 +01:00
Andres Freund	6647248e37	Don't allow immediate interrupts during authentication anymore. We used to handle authentication_timeout by setting ImmediateInterruptOK to true during large parts of the authentication phase of a new connection. While that happens to work acceptably in practice, it's not particularly nice and has ugly corner cases. Previous commits converted the FE/BE communication to use latches and implemented support for interrupt handling during both send/recv. Building on top of that work we can get rid of ImmediateInterruptOK during authentication, by immediately treating timeouts during authentication as a reason to die. As die interrupts are handled immediately during client communication that provides a sensibly quick reaction time to authentication timeout. Additionally add a few CHECK_FOR_INTERRUPTS() to some more complex authentication methods. More could be added, but this should already provide reasonable coverage. While this overall increases the maximum time till a timeout is reacted to, it greatly reduces complexity and increases reliability. That seems like an overall win. If the increase proves to be noticeable we can deal with those cases by moving to nonblocking network code and adding interrupt checking there. Reviewed-By: Heikki Linnakangas	2015-02-03 22:54:48 +01:00
Tom Lane	cec916f35b	Remove unused "m" field in LSEG. This field has been unreferenced since 1998, and does not appear in lseg values stored on disk (since sizeof(lseg) is only 32 bytes according to pg_type). There was apparently some idea of maintaining it just in values appearing in memory, but the bookkeeping required to make that work would surely far outweigh the cost of recalculating the line's slope when needed. Remove it to (a) simplify matters and (b) suppress some uninitialized-field whining from Coverity.	2015-02-03 16:53:32 -05:00
Andres Freund	4fe384bd85	Process 'die' interrupts while reading/writing from the client socket. Up to now it was impossible to terminate a backend that was trying to send/recv data to/from the client when the socket's buffer was already full/empty. While the send/recv calls itself might have gotten interrupted by signals on some platforms, we just immediately retried. That could lead to situations where a backend couldn't be terminated , after a client died without the connection being closed, because it was blocked in send/recv. The problem was far more likely to be hit when sending data than when reading. That's because while reading a command from the client, and during authentication, we processed interrupts immediately . That primarily left COPY FROM STDIN as being problematic for recv. Change things so that that we process 'die' events immediately when the appropriate signal arrives. We can't sensibly react to query cancels at that point, because we might loose sync with the client as we could be in the middle of writing a message. We don't interrupt writes if the write buffer isn't full, as indicated by write() returning EWOULDBLOCK, as that would lead to fewer error messages reaching clients. Per discussion with Kyotaro HORIGUCHI and Heikki Linnakangas Discussion: 20140927191243.GD5423@alap3.anarazel.de	2015-02-03 22:45:45 +01:00
Andres Freund	4f85fde8eb	Introduce and use infrastructure for interrupt processing during client reads. Up to now large swathes of backend code ran inside signal handlers while reading commands from the client, to allow for speedy reaction to asynchronous events. Most prominently shared invalidation and NOTIFY handling. That means that complex code like the starting/stopping of transactions is run in signal handlers... The required code was fragile and verbose, and is likely to contain bugs. That approach also severely limited what could be done while communicating with the client. As the read might be from within openssl it wasn't safely possible to trigger an error, e.g. to cancel a backend in idle-in-transaction state. We did that in some cases, namely fatal errors, nonetheless. Now that FE/BE communication in the backend employs non-blocking sockets and latches to block, we can quite simply interrupt reads from signal handlers by setting the latch. That allows us to signal an interrupted read, which is supposed to be retried after returning from within the ssl library. As signal handlers now only need to set the latch to guarantee timely interrupt processing, remove a fair amount of complicated & fragile code from async.c and sinval.c. We could now actually start to process some kinds of interrupts, like sinval ones, more often than before, but that seems better done separately. This work will hopefully allow cases like being blocked by sending data, interrupting idle transactions and similar to be implemented without too much effort. In addition to allowing getting rid of ImmediateInterruptOK, that is. Author: Andres Freund Reviewed-By: Heikki Linnakangas	2015-02-03 22:25:20 +01:00
Andres Freund	387da18874	Use a nonblocking socket for FE/BE communication and block using latches. This allows introducing more elaborate handling of interrupts while reading from a socket. Currently some interrupt handlers have to do significant work from inside signal handlers, and it's very hard to correctly write code to do so. Generic signal handler limitations, combined with the fact that we can't safely jump out of a signal handler while reading from the client have prohibited implementation of features like timeouts for idle-in-transaction. Additionally we use the latch code to wait in a couple places where we previously only had waiting code on Windows as other platforms just busy looped. This can increase the number of system calls happening during FE/BE communication. Benchmarks so far indicate that the impact isn't very high, and there's room for optimization in the latch code. The chance of cleaning up that the use of latches gives us seems to outweigh the risk of small performance regressions. This commit theoretically can't be used without the next patch in the series, as WaitLatchOrSocket is not defined to be fully signal safe. As we already do that in some cases though, it seems better to keep the commits separate, so they're easier to understand. Author: Andres Freund Reviewed-By: Heikki Linnakangas	2015-02-03 22:03:48 +01:00
Tom Lane	778d498c7d	Fix breakage in GEODEBUG debug code. LINE doesn't have an "m" field (anymore anyway). Also fix unportable assumption that %x can print the result of pointer subtraction. In passing, improve single_decode() in minor ways: * Remove unnecessary leading-whitespace skip (strtod does that already). * Make GEODEBUG message more intelligible. * Remove entirely-useless test to see if strtod returned a silly pointer. * Don't bother computing trailing-whitespace skip unless caller wants an ending pointer. This has been broken since `261c7d4b65`. Although it's only debug code, might as well fix the 9.4 branch too.	2015-02-03 15:20:45 -05:00
Heikki Linnakangas	91fa7b4719	Add API functions to libpq to interrogate SSL related stuff. This makes it possible to query for things like the SSL version and cipher used, without depending on OpenSSL functions or macros. That is a good thing if we ever get another SSL implementation. PQgetssl() still works, but it should be considered as deprecated as it only works with OpenSSL. In particular, PQgetSslInUse() should be used to check if a connection uses SSL, because as soon as we have another implementation, PQgetssl() will return NULL even if SSL is in use.	2015-02-03 19:57:52 +02:00
Heikki Linnakangas	809d9a260b	Refactor page compactifying code. The logic to compact away removed tuples from page was duplicated with small differences in PageRepairFragmentation, PageIndexMultiDelete, and PageIndexDeleteNoCompact. Put it into a common function. Reviewed by Peter Geoghegan.	2015-02-03 14:09:29 +02:00
Heikki Linnakangas	efba7a542f	Fix typo in comment. Amit Langote	2015-02-03 09:49:07 +02:00
Robert Haas	5d2f957f3f	Add new function BackgroundWorkerInitializeConnectionByOid. Sometimes it's useful for a background worker to be able to initialize its database connection by OID rather than by name, so provide a way to do that.	2015-02-02 16:23:59 -05:00
Heikki Linnakangas	2b3a8b20c2	Be more careful to not lose sync in the FE/BE protocol. If any error occurred while we were in the middle of reading a protocol message from the client, we could lose sync, and incorrectly try to interpret a part of another message as a new protocol message. That will usually lead to an "invalid frontend message" error that terminates the connection. However, this is a security issue because an attacker might be able to deliberately cause an error, inject a Query message in what's supposed to be just user data, and have the server execute it. We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other operations that could ereport(ERROR) in the middle of processing a message, but a query cancel interrupt or statement timeout could nevertheless cause it to happen. Also, the V2 fastpath and COPY handling were not so careful. It's very difficult to recover in the V2 COPY protocol, so we will just terminate the connection on error. In practice, that's what happened previously anyway, as we lost protocol sync. To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set whenever we're in the middle of reading a message. When it's set, we cannot safely ERROR out and continue running, because we might've read only part of a message. PqCommReadingMsg acts somewhat similarly to critical sections in that if an error occurs while it's set, the error handler will force the connection to be terminated, as if the error was FATAL. It's not implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted to PANIC in critical sections, because we want to be able to use PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes advantage of that to prevent an OOM error from terminating the connection. To prevent unnecessary connection terminations, add a holdoff mechanism similar to HOLD/RESUME_INTERRUPTS() that can be used to hold off query cancel interrupts, but still allow die interrupts. The rules on which interrupts are processed when are now a bit more complicated, so refactor ProcessInterrupts() and the calls to it in signal handlers so that the signal handlers always call it if ImmediateInterruptOK is set, and ProcessInterrupts() can decide to not do anything if the other conditions are not met. Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund. Backpatch to all supported versions. Security: CVE-2015-0244	2015-02-02 17:09:53 +02:00
Bruce Momjian	29725b3db6	port/snprintf(): fix overflow and do padding Prevent port/snprintf() from overflowing its local fixed-size buffer and pad to the desired number of digits with zeros, even if the precision is beyond the ability of the native sprintf(). port/snprintf() is only used on systems that lack a native snprintf(). Reported by Bruce Momjian. Patch by Tom Lane. Backpatch to all supported versions. Security: CVE-2015-0242	2015-02-02 10:00:45 -05:00
Bruce Momjian	9241c84cbc	to_char(): prevent writing beyond the allocated buffer Previously very long localized month and weekday strings could overflow the allocated buffers, causing a server crash. Reported and patch reviewed by Noah Misch. Backpatch to all supported versions. Security: CVE-2015-0241	2015-02-02 10:00:45 -05:00
Bruce Momjian	0150ab567b	to_char(): prevent accesses beyond the allocated buffer Previously very long field masks for floats could access memory beyond the existing buffer allocated to hold the result. Reported by Andres Freund and Peter Geoghegan. Backpatch to all supported versions. Security: CVE-2015-0241	2015-02-02 10:00:44 -05:00
Peter Eisentraut	f8948616c9	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 19c72ea8d856d7b1d4f5d759a766c8206bf9ce53	2015-02-01 23:23:40 -05:00
Tom Lane	b7d254c079	Fix documentation of psql's ECHO all mode. "ECHO all" is ignored for interactive input, and has been for a very long time, though possibly not for as long as the documentation has claimed the opposite. Fix that, and also note that empty lines aren't echoed, which while dubious is another longstanding behavior (it's embedded in our regression test files for one thing). Per bug #12721 from Hans Ginzel. In HEAD, also improve the code comments in this area, and suppress an unnecessary fflush(stdout) when we're not echoing. That would likely be safe to back-patch, but I'll not risk it mere hours before a release wrap.	2015-01-31 18:35:13 -05:00
Tom Lane	08bd0c5811	Update time zone data files to tzdata release 2015a. DST law changes in Chile and Mexico (state of Quintana Roo). Historical changes for Iceland.	2015-01-30 22:45:44 -05:00
Tom Lane	451d280815	Fix jsonb Unicode escape processing, and in consequence disallow \u0000. We've been trying to support \u0000 in JSON values since commit `78ed8e03c6`, and have introduced increasingly worse hacks to try to make it work, such as commit `0ad1a81632`. However, it fundamentally can't work in the way envisioned, because the stored representation looks the same as for \\u0000 which is not the same thing at all. It's also entirely bogus to output \u0000 when de-escaped output is called for. The right way to do this would be to store an actual 0x00 byte, and then throw error only if asked to produce de-escaped textual output. However, getting to that point seems likely to take considerable work and may well never be practical in the 9.4.x series. To preserve our options for better behavior while getting rid of the nasty side-effects of `0ad1a81632`, revert that commit in toto and instead throw error if \u0000 is used in a context where it needs to be de-escaped. (These are the same contexts where non-ASCII Unicode escapes throw error if the database encoding isn't UTF8, so this behavior is by no means without precedent.) In passing, make both the \u0000 case and the non-ASCII Unicode case report ERRCODE_UNTRANSLATABLE_CHARACTER / "unsupported Unicode escape sequence" rather than claiming there's something wrong with the input syntax. Back-patch to 9.4, where we have to do something because `0ad1a81632` broke things for many cases having nothing to do with \u0000. 9.3 also has bogus behavior, but only for that specific escape value, so given the lack of field complaints it seems better to leave 9.3 alone.	2015-01-30 14:44:56 -05:00
Robert Haas	bd4e2fd97d	Provide a way to suppress the "out of memory" error when allocating. Using the new interface MemoryContextAllocExtended, callers can specify MCXT_ALLOC_NO_OOM if they are prepared to handle a NULL return value. Michael Paquier, reviewed and somewhat revised by me.	2015-01-30 12:56:48 -05:00
Tom Lane	3d660d33aa	Fix assorted oversights in range selectivity estimation. calc_rangesel() failed outright when comparing range variables to empty constant ranges with < or >=, as a result of missing cases in a switch. It also produced a bogus estimate for > comparison to an empty range. On top of that, the >= and > cases were mislabeled throughout. For nonempty constant ranges, they managed to produce the right answers anyway as a result of counterbalancing typos. Also, default_range_selectivity() omitted cases for elem <@ range, range &< range, and range &> range, so that rather dubious defaults were applied for these operators. In passing, rearrange the code in rangesel() so that the elem <@ range case is handled in a less opaque fashion. Report and patch by Emre Hasegeli, some additional work by me	2015-01-30 12:30:59 -05:00
Heikki Linnakangas	68fa75f318	Fix query-duration memory leak with GIN rescans. The requiredEntries / additionalEntries arrays were not freed in freeScanKeys() like other per-key stuff. It's not obvious, but startScanKey() was only ever called after the keys have been initialized with ginNewScanKey(). That's why it doesn't need to worry about freeing existing arrays. The ginIsNewKey() test in gingetbitmap was never true, because ginrescan frees the existing keys, and it's not OK to call gingetbitmap twice in a row without calling ginrescan in between. To make that clear, remove the unnecessary ginIsNewKey(). And just to be extra sure that nothing funny happens if there is an existing key after all, call freeScanKeys() to free it if it exists. This makes the code more straightforward. (I'm seeing other similar leaks in testing a query that rescans a GIN index scan, but that's a different issue. This just fixes the obvious leak with those two arrays.) Backpatch to 9.4, where GIN fast scan was added.	2015-01-30 17:58:23 +01:00
Kevin Grittner	cff1bd2a3c	Allow pg_dump to use jobs and serializable transactions together. Since 9.3, when the --jobs option was introduced, using it together with the --serializable-deferrable option generated multiple errors. We can get correct behavior by allowing the connection which acquires the snapshot to use SERIALIZABLE, READ ONLY, DEFERRABLE and pass that to the workers running the other connections using REPEATABLE READ, READ ONLY. This is a bit of a kluge since the SERIALIZABLE behavior is achieved by running some of the participating connections at a different isolation level, but it is a simple and safe change, suitable for back-patching. This will be followed by a proposal for a more invasive fix with some slight behavioral changes on just the master branch, based on suggestions from Andres Freund, but the kluge will be applied to master until something is agreed along those lines. Back-patched to 9.3, where the --jobs option was added. Based on report from Alexander Korotkov	2015-01-30 08:57:24 -06:00
Stephen Frost	32bf6ee6ab	Fix BuildIndexValueDescription for expressions In `804b6b6db4` we modified BuildIndexValueDescription to pay attention to which columns are visible to the user, but unfortunately that commit neglected to consider indexes which are built on expressions. Handle error-reporting of violations of constraint indexes based on expressions by not returning any detail when the user does not have table-level SELECT rights. Backpatch to 9.0, as the prior commit was. Pointed out by Tom.	2015-01-29 21:59:34 -05:00
Andres Freund	17792bfc5b	Properly terminate the array returned by GetLockConflicts(). GetLockConflicts() has for a long time not properly terminated the returned array. During normal processing the returned array is zero initialized which, while not pretty, is sufficient to be recognized as an invalid virtual transaction id. But the HotStandby case is more than aesthetically broken: The allocated (and reused) array is neither zeroed upon allocation, nor reinitialized, nor terminated. Not having a terminating element means that the end of the array will not be recognized and that recovery conflict handling will thus read ahead into adjacent memory, only terminating when hitting memory content that looks like an invalid virtual transaction id. Luckily this seems so far not to have caused significant problems, besides making recovery conflict more expensive. Discussion: 20150127142713.GD29457@awork2.anarazel.de Backpatch into all supported branches.	2015-01-29 22:48:45 +01:00
Andres Freund	ed127002d8	Align buffer descriptors to cache line boundaries. Benchmarks have shown that aligning the buffer descriptor array to cache lines is important for scalability; especially on bigger, multi-socket, machines. Currently the array sometimes already happens to be aligned by happenstance, depending on how large previous shared memory allocations were. That can lead to wildly varying performance results after minor configuration changes. In addition to aligning the start of descriptor array, also force the size of individual descriptors to be of a common cache line size (64 bytes). That happens to already be the case on 64bit platforms, but this way we can change the struct BufferDesc more easily. As the alignment primarily matters in highly concurrent workloads which probably all are 64bit these days, and the space wastage of element alignment would be a bit more noticeable on 32bit systems, we don't force the stride to be cacheline sized on 32bit platforms for now. If somebody does actual performance testing, we can reevaluate that decision by changing the definition of BUFFERDESC_PADDED_SIZE. Discussion: 20140202151319.GD32123@awork2.anarazel.de Per discussion with Bruce Momjian, Tom Lane, Robert Haas, and Peter Geoghegan.	2015-01-29 22:48:45 +01:00
Andres Freund	7142bfbbd3	Fix #ifdefed'ed out code to compile again.	2015-01-29 22:48:45 +01:00
Heikki Linnakangas	31ed42b9a3	Fix bug where GIN scan keys were not initialized with gin_fuzzy_search_limit. When gin_fuzzy_search_limit was used, we could jump out of startScan() without calling startScanKey(). That was harmless in 9.3 and below, because startScanKey() didn't do anything interesting, but in 9.4 it initializes information needed for skipping entries (aka GIN fast scans), and you readily get a segfault if it's not done. Nevertheless, it was clearly wrong all along, so backpatch all the way to 9.1 where the early return was introduced. (AFAICS startScanKey() did nothing useful in 9.3 and below, because the fields it initialized were already initialized in ginFillScanKey(), but I don't dare to change that in a minor release. ginFillScanKey() is always called in gingetbitmap() even though there's a check there to see if the scan keys have already been initialized, because they never are; ginrescan() frees them.) In passing, remove unnecessary if-check from the second inner loop in startScan(). We already check in the first loop that the condition is true for all entries. Reported by Olaf Gawenda, bug #12694, Backpatch to 9.1 and above, although AFAICS it causes a live bug only in 9.4.	2015-01-29 19:35:55 +02:00
Robert Haas	3d6d1b5855	Move out-of-memory error checks from aset.c to mcxt.c This potentially allows us to add mcxt.c interfaces that do something other than throw an error when memory cannot be allocated. We'll handle adding those interfaces in a separate commit. Michael Paquier, with minor changes by me	2015-01-29 10:23:38 -05:00
Stephen Frost	c7cf9a2433	Add usebypassrls to pg_user and pg_shadow The row level security patches didn't add the 'usebypassrls' columns to the pg_user and pg_shadow views on the belief that they were deprecated, but we haven't actually said they are and therefore we should include it. This patch corrects that, adds missing documentation for rolbypassrls into the system catalog page for pg_authid, along with the entries for pg_user and pg_shadow, and cleans up a few other uses of 'row-level' cases to be 'row level' in the docs. Pointed out by Amit Kapila. Catalog version bump due to system view changes.	2015-01-28 21:47:15 -05:00
Stephen Frost	f8519a6a46	Clean up range-table building in copy.c Commit `804b6b6db4` added the build of a range table in copy.c to initialize the EState es_range_table since it can be needed in error paths. Unfortunately, that commit didn't appreciate that some code paths might end up not initializing the rte which is used to build the range table. Fix that and clean up a couple of other things along the way: build it only once and don't explicitly set it on the !is_from path as it doesn't make any sense there (cstate is palloc0'd, so this isn't an issue from an initializing standpoint either). The prior commit went back to 9.0, but this only goes back to 9.1 as prior to that the range table build happens immediately after building the RTE and therefore doesn't suffer from this issue. Pointed out by Robert.	2015-01-28 17:42:28 -05:00
Stephen Frost	804b6b6db4	Fix column-privilege leak in error-message paths While building error messages to return to the user, BuildIndexValueDescription, ExecBuildSlotValueDescription and ri_ReportViolation would happily include the entire key or entire row in the result returned to the user, even if the user didn't have access to view all of the columns being included. Instead, include only those columns which the user is providing or which the user has select rights on. If the user does not have any rights to view the table or any of the columns involved then no detail is provided and a NULL value is returned from BuildIndexValueDescription and ExecBuildSlotValueDescription. Note that, for key cases, the user must have access to all of the columns for the key to be shown; a partial key will not be returned. Further, in master only, do not return any data for cases where row security is enabled on the relation and row security should be applied for the user. This required a bit of refactoring and moving of things around related to RLS- note the addition of utils/misc/rls.c. Back-patch all the way, as column-level privileges are now in all supported versions. This has been assigned CVE-2014-8161, but since the issue and the patch have already been publicized on pgsql-hackers, there's no point in trying to hide this commit.	2015-01-28 12:31:30 -05:00
Heikki Linnakangas	acc2b1e843	Fix typo in comment.	2015-01-28 10:26:30 +02:00
Heikki Linnakangas	670bf71f65	Remove dead NULL-pointer checks in GiST code. gist_poly_compress() and gist_circle_compress() checked for a NULL-pointer key argument, but that was dead code; the gist code never passes a NULL-pointer to the "compress" method. This commit also removes a documentation note added in commit `a0a3883`, about doing NULL-pointer checks in the "compress" method. It was added based on the fact that some implementations were doing NULL-pointer checks, but those checks were unnecessary in the first place. The NULL-pointer check in gbt_var_same() function was also unnecessary. The arguments to the "same" method come from the "compress", "union", or "picksplit" methods, but none of them return a NULL pointer. None of this is to be confused with SQL NULL values. Those are dealt with by the gist machinery, and are never passed to the GiST opclass methods. Michael Paquier	2015-01-28 10:03:58 +02:00
Tom Lane	1a2b2034d4	Fix NUMERIC field access macros to treat NaNs consistently. Commit `145343534c` arranged to store numeric NaN values as short-header numerics, but the field access macros did not get the memo: they thought only "SHORT" numerics have short headers. Most of the time this makes no difference because we don't access the weight or dscale of a NaN; but numeric_send does that. As pointed out by Andrew Gierth, this led to fetching uninitialized bytes. AFAICS this could not have any worse consequences than that; in particular, an unaligned stored numeric would have been detoasted by PG_GETARG_NUMERIC, so that there's no risk of a fetch off the end of memory. Still, the code is wrong on its own terms, and it's not hard to foresee future changes that might expose us to real risks. So back-patch to all affected branches.	2015-01-27 12:06:31 -05:00
Tom Lane	4b2a254793	Add a note to PG_TRY's documentation about volatile safety. We had better memorialize what the actual requirements are for this.	2015-01-26 15:53:37 -05:00
Robert Haas	168a809d4b	Re-enable abbreviated keys on Windows. Commit `1be4eb1b2d` disabled this, but I think the real problem here was fixed by commit `b181a91981` and commit `d060e07fa9`. So let's try re-enabling it now and see what happens.	2015-01-26 14:28:14 -05:00
Tom Lane	599d00aa68	Fix volatile-safety issue in pltcl_SPI_execute_plan(). The "callargs" variable is modified within PG_TRY and then referenced within PG_CATCH, which is exactly the coding pattern we've now found to be unsafe. Marking "callargs" volatile would be problematic because it is passed by reference to some Tcl functions, so fix the problem by not modifying it within PG_TRY. We can just postpone the free() till we exit the PG_TRY construct, as is already done elsewhere in this same file. Also, fix failure to free(callargs) when exiting on too-many-arguments error. This is only a minor memory leak, but a leak nonetheless. In passing, remove some unnecessary "volatile" markings in the same function. Those doubtless are there because gcc 2.95.3 whinged about them, but we now know that its algorithm for complaining is many bricks shy of a load. This is certainly a live bug with compilers that optimize similarly to current gcc, so back-patch to all active branches.	2015-01-26 12:18:25 -05:00
Tom Lane	c58accd70b	Fix volatile-safety issue in asyncQueueReadAllNotifications(). The "pos" variable is modified within PG_TRY and then referenced within PG_CATCH, so for strict POSIX conformance it must be marked volatile. Superficially the code looked safe because pos's address was taken, which was sufficient to force it into memory ... but it's not sufficient to ensure that the compiler applies updates exactly where the program text says to. The volatility marking has to extend into a couple of subroutines too, but I think that's probably a good thing because the risk of out-of-order updates is mostly in those subroutines not asyncQueueReadAllNotifications() itself. In principle the compiler could have re-ordered operations such that an error could be thrown while "pos" had an incorrect value. It's unclear how real the risk is here, but for safety back-patch to all active branches.	2015-01-26 11:57:33 -05:00
Tom Lane	c70f9e8988	Further cleanup of ReorderBufferCommit(). On closer inspection, we can remove the "volatile" qualifier on "using_subtxn" so long as we initialize that before the PG_TRY block, which there's no particularly good reason not to do. Also, push the "change" variable inside the PG_TRY so as to remove all question of whether it needs "volatile", and remove useless early initializations of "snapshot_now" and "using_subtxn".	2015-01-25 22:49:56 -05:00
Tom Lane	bf007a27ac	Clean up assorted issues in ALTER SYSTEM coding. Fix unsafe use of a non-volatile variable in PG_TRY/PG_CATCH in AlterSystemSetConfigFile(). While at it, clean up a bundle of other infelicities and outright bugs, including corner-case-incorrect linked list manipulation, a poorly designed and worse documented parse-and-validate function (which even included some randomly chosen hard-wired substitutes for the specified elevel in one code path ... wtf?), direct use of open() instead of fd.c's facilities, inadequate checking of write()'s return value, and generally poorly written commentary.	2015-01-25 20:19:04 -05:00
Tom Lane	fd496129d1	Clean up some mess in row-security patches. Fix unsafe coding around PG_TRY in RelationBuildRowSecurity: can't change a variable inside PG_TRY and then use it in PG_CATCH without marking it "volatile". In this case though it seems saner to avoid that by doing a single assignment before entering the TRY block. I started out just intending to fix that, but the more I looked at the row-security code the more distressed I got. This patch also fixes incorrect construction of the RowSecurityPolicy cache entries (there was not sufficient care taken to copy pass-by-ref data into the cache memory context) and a whole bunch of sloppiness around the definition and use of pg_policy.polcmd. You can't use nulls in that column because initdb will mark it NOT NULL --- and I see no particular reason why a null entry would be a good idea anyway, so changing initdb's behavior is not the right answer. The internal value of '\0' wouldn't be suitable in a "char" column either, so after a bit of thought I settled on using '*' to represent ALL. Chasing those changes down also revealed that somebody wasn't paying attention to what the underlying values of ACL_UPDATE_CHR etc really were, and there was a great deal of lackadaisicalness in the catalogs.sgml documentation for pg_policy and pg_policies too. This doesn't pretend to be a complete code review for the row-security stuff, it just fixes the things that were in my face while dealing with the bugs in RelationBuildRowSecurity.	2015-01-24 16:16:22 -05:00
Tom Lane	f8a4dd2e14	Fix unsafe coding in ReorderBufferCommit(). "iterstate" must be marked volatile since it's changed inside the PG_TRY block and then used in the PG_CATCH stanza. Noted by Mark Wilding of Salesforce. (We really need to see if we can't get the C compiler to warn about this.) Also, reset iterstate to NULL after the mainline ReorderBufferIterTXNFinish call, to ensure the PG_CATCH block doesn't try to do that a second time.	2015-01-24 13:25:19 -05:00
Tom Lane	586dd5d6a5	Replace a bunch more uses of strncpy() with safer coding. strncpy() has a well-deserved reputation for being unsafe, so make an effort to get rid of nearly all occurrences in HEAD. A large fraction of the remaining uses were passing length less than or equal to the known strlen() of the source, in which case no null-padding can occur and the behavior is equivalent to memcpy(), though doubtless slower and certainly harder to reason about. So just use memcpy() in these cases. In other cases, use either StrNCpy() or strlcpy() as appropriate (depending on whether padding to the full length of the destination buffer seems useful). I left a few strncpy() calls alone in the src/timezone/ code, to keep it in sync with upstream (the IANA tzcode distribution). There are also a few such calls in ecpg that could possibly do with more analysis. AFAICT, none of these changes are more than cosmetic, except for the four occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength source leads to a non-null-terminated destination buffer and ensuing misbehavior. These don't seem like security issues, first because no stack clobber is possible and second because if your values of sslcert etc are coming from untrusted sources then you've got problems way worse than this. Still, it's undesirable to have unpredictable behavior for overlength inputs, so back-patch those four changes to all active branches.	2015-01-24 13:05:42 -05:00
Tom Lane	9222cd84b0	Remove no-longer-referenced src/port/gethostname.c. This file hasn't been part of any build since 2005, and even before that wasn't used unless you configured --with-krb4 (and had a machine without gethostname(2), obviously). What's more, we haven't actually called gethostname anywhere since then, either (except in thread_test.c, whose testing of this function is probably pointless). So we don't need it.	2015-01-24 12:13:57 -05:00
Alvaro Herrera	f2789ab84e	Fix assignment operator thinko Pointed out by Michael Paquier	2015-01-24 11:15:56 -03:00
Robert Haas	d1747571b6	Fix typos, update README. Peter Geoghegan	2015-01-23 15:06:53 -05:00
Alvaro Herrera	a179232047	vacuumdb: enable parallel mode This mode allows vacuumdb to open several server connections to vacuum or analyze several tables simultaneously. Author: Dilip Kumar. Some reworking by Álvaro Herrera Reviewed by: Jeff Janes, Amit Kapila, Magnus Hagander, Andres Freund	2015-01-23 15:02:45 -03:00
Robert Haas	5cefbf5a6c	Don't use abbreviated keys for the final merge pass. When we write tuples out to disk and read them back in, the abbreviated keys become non-abbreviated, because the readtup routines don't know anything about abbreviation. But without this fix, the rest of the code still thinks the abbreviation-aware comparator should be used, so chaos ensues. Report by Andrew Gierth; patch by Peter Geoghegan.	2015-01-23 11:58:31 -05:00
Robert Haas	6a3c6ba0ba	Add an explicit cast to Size to hyperloglog.c MSVC generates a warning here; we hope this will make it happy. Report by Michael Paquier. Patch by David Rowley.	2015-01-23 11:44:51 -05:00
Tom Lane	eb213acfe2	Prevent duplicate escape-string warnings when using pg_stat_statements. contrib/pg_stat_statements will sometimes run the core lexer a second time on submitted statements. Formerly, if you had standard_conforming_strings turned off, this led to sometimes getting two copies of any warnings enabled by escape_string_warning. While this is probably no longer a big deal in the field, it's a pain for regression testing. To fix, change the lexer so it doesn't consult the escape_string_warning GUC variable directly, but looks at a copy in the core_yy_extra_type state struct. Then, pg_stat_statements can change that copy to disable warnings while it's redoing the lexing. It seemed like a good idea to make this happen for all three of the GUCs consulted by the lexer, not just escape_string_warning. There's not an immediate use-case for callers to adjust the other two AFAIK, but making it possible is easy enough and seems like good future-proofing. Arguably this is a bug fix, but there doesn't seem to be enough interest to justify a back-patch. We'd not be able to back-patch exactly as-is anyway, for fear of breaking ABI compatibility of the struct. (We could perhaps back-patch the addition of only escape_string_warning by adding it at the end of the struct, where there's currently alignment padding space.)	2015-01-22 18:11:00 -05:00
Peter Eisentraut	f5f2c2de16	Fix whitespace	2015-01-22 16:57:16 -05:00
Alvaro Herrera	972bf7d6f1	Tweak BRIN minmax operator class In the union support proc, we were not checking the hasnulls flag of value A early enough, so it could be skipped if the "allnulls" flag in value B is set. Also, a check on the allnulls flag of value "B" was redundant, so remove it. Also change inet_minmax_ops to not be the default opclass for type inet, as a future inclusion operator class would be more useful and it's pretty difficult to change default opclass for a datatype later on. (There is no catversion bump for this catalog change; this shouldn't be a problem.) Extracted from a larger patch to add an "inclusion" operator class. Author: Emre Hasegeli	2015-01-22 17:01:09 -03:00
Robert Haas	d060e07fa9	Repair brain fade in commit `b181a91981`. The split between which things need to happen in the C-locale case and which needed to happen in the locale-aware case was a few bricks short of a load. Try to fix that.	2015-01-22 12:51:20 -05:00
Bruce Momjian	59367fdf97	adjust ACL owners for REASSIGN and ALTER OWNER TO When REASSIGN and ALTER OWNER TO are used, both the object owner and ACL list should be changed from the old owner to the new owner. This patch fixes types, foreign data wrappers, and foreign servers to change their ACL list properly; they already changed owners properly. BACKWARD INCOMPATIBILITY? Report by Alexey Bashtanov	2015-01-22 12:36:55 -05:00
Robert Haas	b181a91981	More fixes for abbreviated keys infrastructure. First, when LC_COLLATE = C, bttext_abbrev_convert should use memcpy() rather than strxfrm() to construct the abbreviated key, because the authoritative comparator uses memcpy(). If we do anything else here, we might get inconsistent answers, and the buildfarm says this risk is not theoretical. It should be faster this way, too. Second, while I'm looking at bttext_abbrev_convert, convert a needless use of goto into the loop it's trying to implement into an actual loop. Both of the above problems date to the original commit of abbreviated keys, commit `4ea51cdfe8`. Third, fix a bogus assignment to tss->locale before tss is set up. That's a new goof in commit `b529b65d1b`.	2015-01-22 11:58:58 -05:00
Robert Haas	b529b65d1b	Heavily refactor btsortsupport_worker. Prior to commit `4ea51cdfe8`, this function only had one job, which was to decide whether we could avoid trampolining through the fmgr layer when performing sort comparisons. As of that commit, it has a second job, which is to decide whether we can use abbreviated keys. Unfortunately, those two tasks are somewhat intertwined in the existing coding, which is likely why neither Peter Geoghegan nor I noticed prior to commit that this calls pg_newlocale_from_collation() in cases where it didn't previously. The buildfarm noticed, though. To fix, rewrite the logic so that the decision as to which comparator to use is more cleanly separated from the decision about abbreviation.	2015-01-22 10:54:16 -05:00
Alvaro Herrera	813ffc0ef9	reinit.h: Fix typo in identification comment Author: Sawada Masahiko	2015-01-22 12:26:51 -03:00
Robert Haas	1be4eb1b2d	Disable abbreviated keys on Windows. Most of the Windows buildfarm members (bowerbird, hamerkop, currawong, jacana, brolga) are unhappy with yesterday's abbreviated keys patch, although there are some (narwhal, frogmouth) that seem OK with it. Since there's no obvious pattern to explain why some are working and others are failing, just disable this across-the-board on Windows for now. This is a bit unfortunate since the optimization will be a big win in some cases, but we can't leave the buildfarm broken.	2015-01-20 20:32:21 -05:00
Bruce Momjian	f259e71dbe	tools/ccsym: update for modern versions of gcc This dumps the predefined preprocessor macros	2015-01-20 13:02:58 -05:00
Robert Haas	f32a1fa462	Add strxfrm_l to list of functions where Windows adds an underscore. Per buildfarm failure on bowerbird after last night's commit `4ea51cdfe8`. Peter Geoghegan	2015-01-20 10:52:01 -05:00
Tom Lane	aa719391d5	In pg_regress, remove the temporary installation upon successful exit. This results in a very substantial reduction in disk space usage during "make check-world", since that sequence involves creation of numerous temporary installations. It should also help a bit in the buildfarm, even though the buildfarm script doesn't create as many temp installations, because the current script misses deleting some of them; and anyway it seems better to do this once in one place rather than expecting that script to get it right every time. In 9.4 and HEAD, also undo the unwise choice in commit `b1aebbb6a8` to report strerror(errno) after a rmtree() failure. rmtree has already reported that, possibly for multiple failures with distinct errnos; and what's more, by the time it returns there is no good reason to assume that errno still reflects the last reportable error. So reporting errno here is at best redundant and at worst badly misleading. Back-patch to all supported branches, so that future revisions of the buildfarm script can rely on this behavior.	2015-01-19 23:44:19 -05:00
Tom Lane	75b48e1fff	Adjust "pgstat wait timeout" message to be a translatable LOG message. Per discussion, change the log level of this message to be LOG not WARNING. The main point of this change is to avoid causing buildfarm run failures when the stats collector is exceptionally slow to respond, which it not infrequently is on some of the smaller/slower buildfarm members. This change does lose notice to an interactive user when his stats query is looking at out-of-date stats, but the majority opinion (not necessarily that of yours truly) is that WARNING messages would probably not get noticed anyway on heavily loaded production systems. A LOG message at least ensures that the problem is recorded somewhere where bulk auditing for the issue is possible. Also, instead of an untranslated "pgstat wait timeout" message, provide a translatable and hopefully more understandable message "using stale statistics instead of current ones because stats collector is not responding". The original text was written hastily under the assumption that it would never really happen in practice, which we now know to be unduly optimistic. Back-patch to all active branches, since we've seen the buildfarm issue in all branches.	2015-01-19 23:01:33 -05:00
Andres Freund	2d115e47c8	Fix various shortcomings of the new PrivateRefCount infrastructure. As noted by Tom Lane the improvements in `4b4b680c3d` had the problem that in some situations we searched, entered and modified entries in the private refcount hash while holding a spinlock. I had tried to keep the logic entirely local to PinBuffer_Locked(), but that's not really possible given it's called with a spinlock held... Besides being disadvantageous from a performance point of view, this also has problems with error handling safety. If we failed inserting an entry into the hashtable due to an out of memory error, we'd error out with a held spinlock. Not good. Change the way private refcounts are manipulated: Before a buffer can be tracked an entry has to be reserved using ReservePrivateRefCountEntry(); then, if an entry is not found using GetPrivateRefCountEntry(), it can be entered with NewPrivateRefCountEntry(). Also take advantage of the fact that PinBuffer_Locked() currently is never called for buffers that already have been pinned by the current backend and don't search the private refcount entries for preexisting local pins. That results in a small, but measurable, performance improvement. Additionally make ReleaseBuffer() always call UnpinBuffer() for shared buffers. That avoids duplicating work in an eventual UnpinBuffer() call that already has been done in ReleaseBuffer() and also saves some code. Per discussion with Tom Lane. Discussion: 15028.1418772313@sss.pgh.pa.us	2015-01-19 23:59:41 +01:00
Robert Haas	4ea51cdfe8	Use abbreviated keys for faster sorting of text datums. This commit extends the SortSupport infrastructure to allow operator classes the option to provide abbreviated representations of Datums; in the case of text, we abbreviate by taking the first few characters of the strxfrm() blob. If the abbreviated comparison is insufficient to resolve the comparison, we fall back on the normal comparator. This can be much faster than the old way of doing sorting if the first few bytes of the string are usually sufficient to resolve the comparison. There is the potential for a performance regression if all of the strings to be sorted are identical for the first 8+ characters and differ only in later positions; therefore, the SortSupport machinery now provides an infrastructure to abort the use of abbreviation if it appears that abbreviation is producing comparatively few distinct keys. HyperLogLog, a streaming cardinality estimator, is included in this commit and used to make that determination for text. Peter Geoghegan, reviewed by me.	2015-01-19 15:28:27 -05:00
Robert Haas	1605291b6c	Typo fix. Etsuro Fujita	2015-01-19 11:36:48 -05:00
Robert Haas	9d54b93239	BRIN typo fix. Amit Langote	2015-01-19 08:34:29 -05:00
Peter Eisentraut	cb4a3b0410	Install shared libraries also in bin on cygwin, mingw This was previously only done for libpq, now it's done for all shared libraries. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-01-18 22:36:40 -05:00
Tom Lane	75df6dc083	Fix ancient thinko in default table rowcount estimation. The code used sizeof(ItemPointerData) where sizeof(ItemIdData) is correct, since we're trying to account for a tuple's line pointer. Spotted by Tomonari Katsumata (bug #12584). Although this mistake is of very long standing, no back-patch, since it's a relatively harmless error and changing it would risk changing default planner behavior in stable branches. (I don't see any change in regression test outputs here, but the buildfarm may think differently.)	2015-01-18 17:04:11 -05:00
Noah Misch	4c34dcf97f	Activate low-volume optional logging during regression test runs. Elaborated from an idea by Andres Freund.	2015-01-18 14:08:09 -05:00
Andres Freund	525b84c576	Fix use of already freed memory when dumping a database's security label. pg_dump.c:dumpDatabase() called ArchiveEntry() with the results of a query that was PQclear()ed a couple lines earlier. Backpatch to 9.2 where security labels for shared objects were introduced.	2015-01-18 16:04:10 +01:00
Andres Freund	ff44fba46c	Replace walsender's latch with the general shared latch. Relying on the normal shared latch simplifies interrupt/signal handling because we can rely on all signal handlers setting the proc latch. That in turn allows us to avoid the use of ImmediateInterruptOK, which arguably isn't correct because WaitLatchOrSocket isn't declared to be immediately interruptible. Also change sections that wait on the walsender's latch to notice interrupts quicker/more reliably and make them more consistent with each other. This is part of a larger "get rid of ImmediateInterruptOK" series. Discussion: 20150115020335.GZ5245@awork2.anarazel.de	2015-01-17 13:00:42 +01:00
Tom Lane	20af53d719	Show sort ordering options in EXPLAIN output. Up to now, EXPLAIN has contented itself with printing the sort expressions in a Sort or Merge Append plan node. This patch improves that by annotating the sort keys with COLLATE, DESC, USING, and/or NULLS FIRST/LAST whenever nondefault sort ordering options are used. The output is now a reasonably close approximation of an ORDER BY clause equivalent to the plan's ordering. Marius Timmer, Lukas Kreft, and Arne Scheffer; reviewed by Mike Blackwell. Some additional hacking by me.	2015-01-16 18:19:00 -05:00
Heikki Linnakangas	9402869160	Advance backend's advertised xmin more aggressively. Currently, a backend will reset its PGXACT->xmin value when it doesn't have any registered snapshots left. That covered the common case that a transaction in read committed mode runs several queries, one after each other, as there would be no snapshots active between those queries. However, if you hold cursors across each of the queries, we didn't get a chance to reset xmin. To make that better, keep all the registered snapshots in a pairing heap, ordered by xmin so that it's always quick to find the snapshot with the smallest xmin. That allows us to advance PGXACT->xmin whenever the oldest snapshot is deregistered, even if there are others still active. Per discussion originally started by Jeff Davis back in 2009 and more recently by Robert Haas.	2015-01-17 01:15:23 +02:00
Tom Lane	779fdcdeee	Improve new caching logic in tbm_add_tuples(). For no significant extra complexity, we can cache knowledge that the target page is lossy, and save a hash_search per iteration in that case as well. This probably makes little difference, since the extra rechecks that must occur when pages are lossy are way more expensive than anything we can save here ... but we might as well do it if we're going to cache anything.	2015-01-16 13:28:30 -05:00
Andres Freund	f5ae3ba482	Make tbm_add_tuples more efficient by caching the last accessed page. When adding a large number of tuples to a TID bitmap using tbm_add_tuples() sometimes a lot of time was spent looking up a page's entry in the bitmap's internal hashtable. Improve efficiency by caching the last accessed page, while iterating over the passed in tuples, hoping consecutive tuples will often be on the same page. In many cases that's a good bet, and in the rest the added overhead isn't big. Discussion: 54479A85.8060309@sigaev.ru Author: Teodor Sigaev Reviewed-By: David Rowley	2015-01-16 17:47:59 +01:00
Heikki Linnakangas	aa1d2fc5e9	Another attempt at fixing Windows Norwegian locale. Previous fix mapped "Norwegian (Bokmål)" locale, which contains a non-ASCII character, to the pure ASCII alias "norwegian-bokmal". However, it turns out that more recent versions of the CRT library, in particular MSVCR110 (Visual Studio 2012), changed the behaviour of setlocale() so that if you pass "norwegian-bokmal" to setlocale, it returns "Norwegian_Norway". That meant trouble, when setlocale(..., NULL) first returned "Norwegian (Bokmål)_Norway", which we mapped to "norwegian-bokmal_Norway", but another call to setlocale(..., "norwegian-bokmal_Norway") returned "Norwegian_Norway". That caused PostgreSQL to think that they are different locales, and therefore not compatible. That caused initdb to fail at CREATE DATABASE. Older CRT versions seem to accept "Norwegian_Norway" too, so change the mapping to return "Norwegian_Norway" instead of "norwegian-bokmal". Backpatch to 9.2 like the previous attempt. We haven't made a release that includes the previous fix yet, so we don't need to worry about changing the locale of existing clusters from "norwegian-bokmal" to "Norwegian_Norway". (Doing any mapping like this at all requires changing the locale of existing databases; the release notes need to include instructions for that).	2015-01-16 13:28:19 +02:00
Noah Misch	28df6a0df0	Update "pg_regress --no-locale" for Darwin and Windows. Commit `894459e59f` revealed this option to be broken for NLS builds on Darwin, but "make -C contrib/unaccent check" and the buildfarm client rely on it. Fix that configuration by redefining the option to imply LANG=C on Darwin. In passing, use LANG=C instead of LANG=en on Windows; since only postmaster startup uses that value, testers are unlikely to notice the change. Back-patch to 9.0, like the predecessor commit.	2015-01-16 01:27:31 -05:00
Tom Lane	c480cb9d24	Fix use-of-already-freed-memory problem in EvalPlanQual processing. Up to now, the "child" executor state trees generated for EvalPlanQual rechecks have simply shared the ResultRelInfo arrays used for the original execution tree. However, this leads to dangling-pointer problems, because ExecInitModifyTable() is all too willing to scribble on some fields of the ResultRelInfo(s) even when it's being run in one of those child trees. This trashes those fields from the perspective of the parent tree, because even if the generated subtree is logically identical to what was in use in the parent, it's in a memory context that will go away when we're done with the child state tree. We do however want to share information in the direction from the parent down to the children; in particular, fields such as es_instrument must be shared or we'll lose the stats arising from execution of the children. So the simplest fix is to make a copy of the parent's ResultRelInfo array, but not copy any fields back at end of child execution. Per report from Manuel Kniep. The added isolation test is based on his example. In an unpatched memory-clobber-enabled build it will reliably fail with "ctid is NULL" errors in all branches back to 9.1, as a consequence of junkfilter->jf_junkAttNo being overwritten with $7f7f. This test cannot be run as-is before that for lack of WITH syntax; but I have no doubt that some variant of this problem can arise in older branches, so apply the code change all the way back.	2015-01-15 18:52:58 -05:00
Heikki Linnakangas	49b04188f8	Fix thinko in re-setting wal_log_hints flag from a parameter-change record. The flag is supposed to be copied from the record. Same issue with track_commit_timestamps, but that's master-only. Report and fix by Petr Jelinek. Backpatch to 9.4, where wal_log_hints was added.	2015-01-15 20:52:41 +02:00
Tom Lane	8e166e164c	Rearrange explain.c's API so callers need not embed sizeof(ExplainState). The folly of the previous arrangement was just demonstrated: there's no convenient way to add fields to ExplainState without breaking ABI, even if callers have no need to touch those fields. Since we might well need to do that again someday in back branches, let's change things so that only explain.c has to have sizeof(ExplainState) compiled into it. This costs one extra palloc() per EXPLAIN operation, which is surely pretty negligible.	2015-01-15 13:39:33 -05:00
Tom Lane	a5cd70dcbc	Improve performance of EXPLAIN with large range tables. As of 9.3, ruleutils.c goes to some lengths to ensure that table and column aliases used in its output are unique. Of course this takes more time than was required before, which in itself isn't fatal. However, EXPLAIN was set up so that recalculation of the unique aliases was repeated for each subexpression printed in a plan. That results in O(N^2) time and memory consumption for large plan trees, which did not happen in older branches. Fortunately, the expensive work is the same across a whole plan tree, so there is no need to repeat it; we can do most of the initialization just once per query and re-use it for each subexpression. This buys back most (not all) of the performance loss since 9.2. We need an extra ExplainState field to hold the precalculated deparse context. That's no problem in HEAD, but in the back branches, expanding sizeof(ExplainState) seems risky because third-party extensions might have local variables of that struct type. So, in 9.4 and 9.3, introduce an auxiliary struct to keep sizeof(ExplainState) the same. We should refactor the APIs to avoid such local variables in future, but that's material for a separate HEAD-only commit. Per gripe from Alexey Bashtanov. Back-patch to 9.3 where the issue was introduced.	2015-01-15 13:18:12 -05:00
Andres Freund	6cfd5086e1	Blindly try to fix a warning in s_lock.h when compiling with gcc on HPPA. The possibly, depending on compiler settings, generated warning was "warning: `S_UNLOCK' redefined". The hppa spinlock implementation doesn't follow the rules of s_lock.h and provides a gcc specific implementation outside of the part of the file that's supposed to do that. It does so to avoid duplication between the HP compiler and gcc. That unfortunately means that S_UNLOCK is already defined when the HPPA specific section is reached. Undefine the generic fallback S_UNLOCK definition inside the HPPA section. That's far from pretty, but has the big advantage of being simple. If somebody is interested to fix this in a prettier way... This presumably got broken in the course of `0709b7ee72`. Discussion: 20150114225919.GY5245@awork2.anarazel.de Per complaint from Tom Lane.	2015-01-15 13:26:25 +01:00
Andres Freund	59f71a0d0b	Add a default local latch for use in signal handlers. To do so, move InitializeLatchSupport() into the new common process initialization functions, and add a new global variable MyLatch. MyLatch is usable as soon as InitPostmasterChild() has been called (i.e. very early during startup). Initially it points to a process local latch that exists in all processes. InitProcess/InitAuxiliaryProcess then replaces that local latch with PGPROC->procLatch. During shutdown the reverse happens. This is primarily advantageous for two reasons: For one it simplifies dealing with the shared process latch, especially in signal handlers, because instead of having to check for MyProc, MyLatch can be used unconditionally. For another, a later patch that makes FE/BE communication use latches, now can rely on the existence of a latch, even before having gone through InitProcess. Discussion: 20140927191243.GD5423@alap3.anarazel.de	2015-01-14 18:45:22 +01:00
Tom Lane	fd3d894e4e	Remove duplicate specification of -Ae for HP-UX C compiler. Autoconf has known about automatically selecting -Ae when needed for quite some time now, so remove the redundant addition in template/hpux. Noted while setting up buildfarm member pademelon.	2015-01-13 22:52:11 -05:00
Andres Freund	0139dea8f1	Remove some dead IsUnderPostmaster code from bootstrap.c. Since commit `626eb02198` has introduced the auxiliary process infrastructure, bootstrap_signals() was never used when forked from postmaster. Remove the IsUnderPostmaster specific code, and add an appropriate assertion.	2015-01-14 00:37:02 +01:00
Andres Freund	31c453165b	Commonalize process startup code. Move common code, that was duplicated in every postmaster child/every standalone process, into two functions in miscinit.c. Not only does that already result in a fair amount of net code reduction but it also makes it much easier to remove more duplication in the future. The prime motivation wasn't code deduplication though, but easier addition of new common code.	2015-01-14 00:33:14 +01:00
Andres Freund	2be82dcf17	Make logging_collector=on work with non-windows EXEC_BACKEND again. Commit `b94ce6e80` reordered postmaster's startup sequence so that the tempfile directory is only cleaned up after all the necessary state for pg_ctl is collected. Unfortunately the chosen location is after the syslogger has been started; which normally is fine, except for !WIN32 EXEC_BACKEND builds, which pass information to children via files in the temp directory. Move the call to RemovePgTempFiles() to just before the syslogger has started. That's the first child we fork. Luckily EXEC_BACKEND is pretty much only used by endusers on windows, which has a separate method to pass information to children. That means the real world impact of this bug is very small. Discussion: 20150113182344.GF12272@alap3.anarazel.de Backpatch to 9.1, just as the previous commit was.	2015-01-14 00:14:53 +01:00
Heikki Linnakangas	e922a13058	Spell the X072 feature correctly, was missing "with". Also use lower-case for a few more features, to be consistent with the others and with the SQL spec.	2015-01-13 16:08:55 +02:00
Andres Freund	14e8803f10	Add barriers to the latch code. Since their introduction latches have required barriers in SetLatch and ResetLatch - but when they were introduced there wasn't any barrier abstraction. Instead latches were documented to rely on the callsites to provide barrier semantics. Now that the barrier support looks halfway complete, add the necessary barriers to both latch implementations. Also remove a now superfluous lock acquisition from syncrep.c and a superfluous (and insufficient) barrier from freelist.c. There might be other cases that can now be simplified, but those are the only ones I've seen on a quick scan. We might want to backpatch this at some later point, but right now the barrier infrastructure in the backbranches isn't totally on par with master. Discussion: 20150112154026.GB2092@awork2.anarazel.de	2015-01-13 12:58:43 +01:00
Andres Freund	4bad60e3fd	Allow latches to wait for socket writability without waiting for readability. So far WaitLatchOrSocket() required to pass in WL_SOCKET_READABLE as that solely was used to indicate error conditions, like EOF. Waiting for WL_SOCKET_WRITEABLE would have meant to busy wait upon socket errors. Adjust the API to signal errors by returning the socket as readable, writable or both, depending on WL_SOCKET_READABLE/WL_SOCKET_WRITEABLE being specified. It would arguably be nicer to return WL_SOCKET_ERROR but that's not possible on platforms and would probably also result in more complex callsites. This previously had explicitly been forbidden in `e42a21b9e6`, as there was no strong use case at that point. We now are looking into making FE/BE communication use latches, so changing this makes sense. There also are some portability concerns because there are cases of older platforms where select(2) is known to, in violation of POSIX, not return a socket as writable after the peer has closed it. So far the platforms where that's the case provide a working poll(2). If we find one where that's not the case, we'll need to add a workaround for that platform. Discussion: 20140927191243.GD5423@alap3.anarazel.de Reviewed-By: Heikki Linnakangas, Noah Misch	2015-01-13 12:58:43 +01:00
Heikki Linnakangas	3dfce37627	Fix typos in comment. Plus some tiny wordsmithing of not-quite-typos.	2015-01-13 10:32:38 +02:00
Tom Lane	7391e2513f	Fix some functions that were declared static then defined not-static. Per testing with a compiler that whines about this.	2015-01-12 16:08:43 -05:00
Tom Lane	5b3ce2c911	Avoid unexpected slowdown in vacuum regression test. I noticed the "vacuum" regression test taking really significantly longer than it used to on a slow machine. Investigation pointed the finger at commit `e415b469b3`, which added creation of an index using an extremely expensive index function. That function was evidently meant to be applied only twice ... but the test re-used an existing test table, which up till a couple lines before that had had over two thousand rows. Depending on timing of the concurrent regression tests, the intervening VACUUMs might have been unable to remove those recently-dead rows, and then the index build would need to create index entries for them too, leading to the wrap_do_analyze() function being executed 2000+ times not twice. Avoid this by using a different table that is guaranteed to have only the intended two rows in it. Back-patch to 9.0, like the commit that created the problem.	2015-01-12 15:13:53 -05:00
Alvaro Herrera	d126e1e95f	Tweak heapam's rmgr desc output slightly Some spaces were missing, and putting the affected tuple offset first in the lock cases instead of the locking data makes more sense. No backpatch since this is cosmetic and surrounding code has changed.	2015-01-12 16:09:16 -03:00
Alvaro Herrera	5c5ffee80f	Fix get_object_address argument type for extension statement Commit `3f88672a4` neglected to update the AlterExtensionContentsStmt production in the grammar to use TypeName to represent types when passing objects to get_object_address. Reported as a pg_upgrade failure by Jeff Janes.	2015-01-12 15:32:48 -03:00
Tom Lane	1f9bf05e53	Use correct text domain for errcontext() appearing within ereport(). The mechanism added in commit `dbdf9679d7` for associating the correct translation domain with errcontext strings potentially fails in cases where errcontext() is used within an ereport() macro. Such usage was not originally envisioned for errcontext(), but we do have a few places that do it. In this situation, the intended comma expression becomes just a couple of arguments to errfinish(), which the compiler might choose to evaluate right-to-left. Fortunately, in such cases the textdomain for the errcontext string must be the same as for the surrounding ereport. So we can fix this by letting errstart initialize context_domain along with domain; then it will have the correct value no matter which order the calls occur in. (Note that error stack callback functions are not invoked until errfinish, so normal usage of errcontext won't affect what happens for errcontext calls within the ereport macro.) In passing, make sure that errcontext calls within the main backend set context_domain to something non-NULL. This isn't a live bug because NULL would select the current textdomain() setting which should be the right thing anyway --- but it seems better to handle this completely consistently with the regular domain field. Per report from Dmitry Voronin. Backpatch to 9.3; before that, there wasn't any attempt to ensure that errcontext strings were translated in an appropriate domain.	2015-01-12 12:40:29 -05:00
Stephen Frost	1bf4a84d0f	Skip dead backends in MinimumActiveBackends Back in `ed0b409`, PGPROC was split and moved to static variables in procarray.c, with procs in ProcArrayStruct replaced by an array of integers representing process numbers (pgprocnos), with -1 indicating a dead process which has yet to be removed. Access to procArray is generally done under ProcArrayLock and therefore most code does not have to concern itself with -1 entries. However, MinimumActiveBackends intentionally does not take ProcArrayLock, which means it has to be extra careful when accessing procArray. Prior to `ed0b409`, this was handled by checking for a NULL in the pointer array, but that check was no longer valid after the split. Coverity pointed out that the check could never happen and so it was removed in `5592eba`. That didn't make anything worse, but it didn't fix the issue either. The correct fix is to check for pgprocno == -1 and skip over that entry if it is encountered. Back-patch to 9.2, since there can be attempts to access the arrays prior to their start otherwise. Note that the changes prior to 9.4 will look a bit different due to the change in `5592eba`. Note that MinimumActiveBackends only returns a bool for heuristic purposes and any pre-array accesses are strictly read-only and so there is no security implication and the lack of field complaints indicates it's very unlikely to run into issues due to this. Pointed out by Noah.	2015-01-12 11:31:57 -05:00
Tom Lane	44096f1c66	Fix portability breakage in pg_dump. Commit `0eea8047bf` introduced some overly optimistic assumptions about what could be in a local struct variable's initializer. (This might in fact be valid code according to C99, but I've got at least one pre-C99 compiler that falls over on those nonconstant address expressions.) There is no reason whatsoever for main()'s workspace to not be static, so revert long_options[] to a static and make the DumpOptions struct static as well.	2015-01-11 13:28:26 -05:00
Tom Lane	8883bae33b	Remove configure test for nonstandard variants of getpwuid_r(). We had code that supposed that some platforms might offer a nonstandard version of getpwuid_r() with only four arguments. However, the 5-argument definition has been standardized at least since the Single Unix Spec v2, which is our normal reference for what's portable across all Unix-oid platforms. (What's more, this wasn't the only pre-standardization version of getpwuid_r(); my old HPUX 10.20 box has still another signature.) So let's just get rid of the now-useless configure step.	2015-01-11 12:52:37 -05:00
Tom Lane	080eabe2e8	Fix libpq's behavior when /etc/passwd isn't readable. Some users run their applications in chroot environments that lack an /etc/passwd file. This means that the current UID's user name and home directory are not obtainable. libpq used to be all right with that, so long as the database role name to use was specified explicitly. But commit `a4c8f14364` broke such cases by causing any failure of pg_fe_getauthname() to be treated as a hard error. In any case it did little to advance its nominal goal of causing errors in pg_fe_getauthname() to be reported better. So revert that and instead put some real error-reporting code in place. This requires changes to the APIs of pg_fe_getauthname() and pqGetpwuid(), since the latter had departed from the POSIX-specified API of getpwuid_r() in a way that made it impossible to distinguish actual lookup errors from "no such user". To allow such failures to be reported, while not failing if the caller supplies a role name, add a second call of pg_fe_getauthname() in connectOptions2(). This is a tad ugly, and could perhaps be avoided with some refactoring of PQsetdbLogin(), but I'll leave that idea for later. (Note that the complained-of misbehavior only occurs in PQsetdbLogin, not when using the PQconnect functions, because in the latter we will never bother to call pg_fe_getauthname() if the user gives a role name.) In passing also clean up the Windows-side usage of GetUserName(): the recommended buffer size is 257 bytes, the passed buffer length should be the buffer size not buffer size less 1, and any error is reported by GetLastError() not errno. Per report from Christoph Berg. Back-patch to 9.4 where the chroot failure case was introduced. The generally poor reporting of errors here is of very long standing, of course, but given the lack of field complaints about it we won't risk changing these APIs further back (even though they're theoretically internal to libpq).	2015-01-11 12:35:44 -05:00
Andres Freund	de6429a8fd	Provide a generic fallback for pg_compiler_barrier using an extern function. If the compiler/arch combination does not provide compiler barriers, provide a fallback. That fallback simply consists of a function call into an externally defined function. That should guarantee compiler barrier semantics except for compilers that do inter translation unit/global optimization - those better provide an actual compiler barrier. Hopefully this fixes Tom's report of linker failures due to pg_compiler_barrier_impl not being provided. I'm not backpatching this commit as it builds on the new atomics infrastructure. If we decide an equivalent fix needs to be backpatched, I'll do so in a separate commit. Discussion: 27746.1420930690@sss.pgh.pa.us Per report from Tom Lane.	2015-01-11 01:15:29 +01:00
Andres Freund	db4ec2ffce	Fix alignment of pg_atomic_uint64 variables on some 32bit platforms. I failed to recognize that pg_atomic_uint64 wasn't guaranteed to be 8 byte aligned on some 32bit platforms - which it has to be on some platforms to guarantee the desired atomicity and which we assert. As this is all compiler specific code anyway we can just rely on compiler specific tricks to enforce alignment. I've been unable to find concrete documentation about the version that introduced the sunpro alignment support, so that might need additional guards. I've verified that this works with gcc x86 32bit, but I don't have access to any other 32bit environment. Discussion: op.xpsjdkil0sbe7t@vld-kuci Per report from Vladimir Koković.	2015-01-11 01:06:37 +01:00
Stephen Frost	c4fda14845	Fix typo in execMain.c Wee -> We. Pointed out by Etsuro Fujita.	2015-01-09 11:07:35 -05:00
Alvaro Herrera	045c68ad21	xlogreader.c: Fix report_invalid_record translatability flag For some reason I overlooked in GETTEXT_TRIGGERS that the right argument be read by gettext in `7fcbf6a405`. This will drop the translation percentages for the backend all the way back to 9.3 ... Problem reported by Heikki.	2015-01-09 12:34:25 -03:00
Stephen Frost	c219cbfed3	Move rowsecurity event trigger test The event trigger test for rowsecurity can cause problems for other tests which are run in parallel with it. Instead of running that test in the rowsecurity set, move it to the event_trigger set, which runs isolated from other tests. Also reverts `7161b08`, which moved rowsecurity into its own test group. That's no longer necessary, now that the event trigger test is gone from the rowsecurity set of tests. Pointed out by Tom.	2015-01-08 14:14:14 -05:00
Andres Freund	f454144a34	Remove comment that was intended to have been removed before commit. Noticed by Amit Kapila	2015-01-08 13:16:31 +01:00
Andres Freund	93be095007	Move comment about sun cc's __machine_rw_barrier being a full barrier. I'd accidentally written the comment besides the read barrier, instead of the full barrier, implementation. Noticed by Oskari Saarenmaa	2015-01-08 13:08:05 +01:00
Andres Freund	17eaae9897	Fix logging of pages skipped due to pins during vacuum. The new logging introduced in `35192f06` made the incorrect assumption that scan_all vacuums would always wait for buffer pins; but they only do so if the page actually needs to be frozen. Fix that inaccuracy by removing the difference in log output based on scan_all and just always log the same message. I chose to keep the split log message from the original commit for now, it seems likely that it'll be of use in the future. Also merge the line about buffer pins in autovacuum's log output into the existing "pages: ..." line. It seems odd to have a separate line about pins, without the "topic: " prefix others have. Also rename the new 'pinned_pages' variable to 'pinskipped_pages' because it actually tracks the number of pages that could not be pinned. Discussion: 20150104005324.GC9626@awork2.anarazel.de	2015-01-08 12:57:09 +01:00
Noah Misch	2048e5b881	On Darwin, refuse postmaster startup when multithreaded. The previous commit introduced its report at LOG level to avoid surprises at minor release upgrade time. Compel users deploying the next major release to also deploy the reported workaround.	2015-01-07 22:46:59 -05:00
Noah Misch	894459e59f	On Darwin, detect and report a multithreaded postmaster. Darwin --enable-nls builds use a substitute setlocale() that may start a thread. Buildfarm member orangutan experienced BackendList corruption on account of different postmaster threads executing signal handlers simultaneously. Furthermore, a multithreaded postmaster risks undefined behavior from sigprocmask() and fork(). Emit LOG messages about the problem and its workaround. Back-patch to 9.0 (all supported versions).	2015-01-07 22:35:44 -05:00
Noah Misch	6fdba8ceb0	Always set the six locale category environment variables in main(). Typical server invocations already achieved that. Invalid locale settings in the initial postmaster environment interfered, as could malloc() failure. Setting "LC_MESSAGES=pt_BR.utf8 LC_ALL=invalid" in the postmaster environment will now choose C-locale messages, not Brazilian Portuguese messages. Most localized programs, including all PostgreSQL frontend executables, do likewise. Users are unlikely to observe changes involving locale categories other than LC_MESSAGES. CheckMyDatabase() ensures that we successfully set LC_COLLATE and LC_CTYPE; main() sets the remaining three categories to locale "C", which almost cannot fail. Back-patch to 9.0 (all supported versions).	2015-01-07 22:34:57 -05:00
Noah Misch	e415b469b3	Reject ANALYZE commands during VACUUM FULL or another ANALYZE. vacuum()'s static variable handling makes it non-reentrant; an ensuing null pointer dereference crashed the backend. Back-patch to 9.0 (all supported versions).	2015-01-07 22:33:58 -05:00
Heikki Linnakangas	1e78d81e88	Don't open a WAL segment for writing at end of recovery. Since commit `ba94518a`, we used XLogFileOpen to open the next segment for writing, but if the end-of-recovery happens exactly at a segment boundary, the new segment might not exist yet. (Before `ba94518a`, XLogFileOpen was correct, because we would open the previous segment if the switch happened at the boundary.) Instead of trying to create it if necessary, it's simpler to not bother opening the segment at all. XLogWrite() will open or create it soon anyway, after writing the checkpoint or end-of-recovery record. Reported by Andres Freund.	2015-01-07 16:20:20 +02:00
Peter Eisentraut	79af9a1d26	Fix namespace handling in xpath function Previously, the xml value resulting from an xpath query would not have namespace declarations if the namespace declarations were attached to an ancestor element in the input xml value. That means the output value was not correct XML. Fix that by running the result value through xmlCopyNode(), which produces the correct namespace declarations. Author: Ali Akbar <the.apaan@gmail.com>	2015-01-06 23:06:13 -05:00
Andres Freund	3fabed0705	Correctly handle relcache invalidation corner case during logical decoding. When using a historic snapshot for logical decoding it can validly happen that a relation that's in the relcache isn't visible to that historic snapshot. E.g. if a newly created relation is referenced in the query that uses the SQL interface for logical decoding and a sinval reset occurs. The earlier commit that fixed the error handling for that corner case already improves the situation as an ERROR is better than hitting an assertion... But it's obviously not good enough. So additionally allow that case without an error if a historic snapshot is set up - that won't allow an invalid entry to stay in the cache because it's a) already marked invalid and will thus be rebuilt during the next access b) the syscaches will be reset at the end of decoding. There might be prettier solutions to handle this case, but all that we could think of so far end up being much more complex than this quite simple fix. This fixes the assertion failures reported by the buildfarm (markhor, tick, leech) after the introduction of new regression tests in `89fd41b390`. The failures there weren't actually directly caused by CLOBBER_CACHE_ALWAYS but the extraordinarily long runtimes due to it led to sinval resets triggering the behaviour. Discussion: 22459.1418656530@sss.pgh.pa.us Backpatch to 9.4 where logical decoding was introduced.	2015-01-07 00:19:37 +01:00
Andres Freund	31912d01d8	Improve relcache invalidation handling of currently invisible relations. The corner case where a relcache invalidation tried to rebuild the entry for a referenced relation but couldn't find it in the catalog wasn't correct. The code tried to RelationCacheDelete/RelationDestroyRelation the entry. That didn't work when assertions are enabled because the latter contains an assertion ensuring the refcount is zero. It's also more generally a bad idea, because by virtue of being referenced somebody might actually look at the entry, which is possible if the error is trapped and handled via a subtransaction abort. Instead just error out, without deleting the entry. As the entry is marked invalid, the worst that can happen is that the invalid (and at some point unused) entry lingers in the relcache. Discussion: 22459.1418656530@sss.pgh.pa.us There should be no way to hit this case < 9.4 where logical decoding introduced a bug that can hit this. But since the code for handling the corner case is there it should do something halfway sane, so backpatch all the way back. The logical decoding bug will be handled in a separate commit.	2015-01-07 00:18:00 +01:00
Bruce Momjian	cb075178ec	Document that Perl's Tie might add a trailing newline Report by Stefan Kaltenbrunner	2015-01-06 15:52:15 -05:00
Alvaro Herrera	91539c5698	Fix thinko in plpython error message	2015-01-06 15:16:29 -03:00
Bruce Momjian	29c18d919e	Clarify which files need manual copyright updates	2015-01-06 12:53:15 -05:00
Bruce Momjian	338c10b7f9	Simplify post-copyright update instructions.	2015-01-06 11:45:17 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Tom Lane	adfc157dd9	Fix broken pg_dump code for dumping comments on event triggers. This never worked, I think. Per report from Marc Munro. In passing, fix funny spacing in the COMMENT ON command as a result of excess space in the "label" string.	2015-01-05 19:27:04 -05:00
Andres Freund	3c9e4cdbf2	Fix oversight in recent pg_basebackup fix causing pg_receivexlog failures. An oversight in `2c0a485896` causes 'could not create archive status file "...": No such file or directory' errors in pg_receivexlog if the target directory doesn't happen to contain an archive_status directory. That's due to a stupidly left over 'true' constant instead of mark_done being passed down to ProcessXLogDataMsg(). The bug is only present in the master branch, and luckily wasn't released. Spotted by Fujii Masao.	2015-01-05 12:31:05 +01:00
Fujii Masao	9f1d7313aa	Fix typo in comment. Report by Amit Kapila	2015-01-05 16:35:26 +09:00
Alvaro Herrera	d5e3d1e969	Fix thinko in lock mode enum Commit `0e5680f473` contained a thinko mixing LOCKMODE with LockTupleMode. This caused misbehavior in the case where a tuple is marked with a multixact with at most a FOR SHARE lock, and another transaction tries to acquire a FOR NO KEY EXCLUSIVE lock; this case should block but doesn't. Include a new isolation tester spec file to explicitly try all the tuple lock combinations; without the fix it shows the problem: starting permutation: s1_begin s1_lcksvpt s1_tuplock2 s2_tuplock3 s1_commit step s1_begin: BEGIN; step s1_lcksvpt: SELECT * FROM multixact_conflict FOR KEY SHARE; SAVEPOINT foo; a 1 step s1_tuplock2: SELECT * FROM multixact_conflict FOR SHARE; a 1 step s2_tuplock3: SELECT * FROM multixact_conflict FOR NO KEY UPDATE; a 1 step s1_commit: COMMIT; With the fixed code, step s2_tuplock3 blocks until session 1 commits, which is the correct behavior. All other cases behave correctly. Backpatch to 9.3, like the commit that introduced the problem.	2015-01-04 15:48:29 -03:00
Andres Freund	2ea95959af	Add error handling for failing fstat() calls in copy.c. These calls are pretty much guaranteed not to fail unless something has gone horribly wrong, and even in that case we'd just error out a short time later. But since several code checkers complain about the missing check it seems worthwhile to fix it nonetheless. Pointed out by Coverity.	2015-01-04 16:47:23 +01:00
Andres Freund	14570c2828	Remove superfluous variable from xlogreader's XLogFindNextRecord(). Pointed out by Coverity. Since this is mere, and debatable, cosmetics I'm not backpatching this.	2015-01-04 15:35:46 +01:00
Andres Freund	0398ece4c5	Fix inconsequential fd leak in the new mark_file_as_archived() function. As every error in mark_file_as_archived() will lead to a failure of pg_basebackup the FD leak couldn't ever lead to a real problem. It seems better to fix the leak anyway though, rather than silence Coverity, as the usage of the function might get extended or copied at some point in the future. Pointed out by Coverity. Backpatch to 9.2, like the relevant part of the previous patch.	2015-01-04 14:36:21 +01:00
Andres Freund	2c0a485896	Prevent WAL files created by pg_basebackup -x/X from being archived again. WAL (and timeline history) files created by pg_basebackup did not maintain the new base backup's archive status. That's currently not a problem if the new node is used as a standby - but if that node is promoted all still existing files can get archived again. With a high wal_keep_segments setting that can happen a significant time later - which is quite confusing. Change both the backend (for the -x/-X fetch case) and pg_basebackup (for -X stream) itself to always mark WAL/timeline files included in the base backup as .done. That's in line with walreceiver.c doing so. The verbosity of the pg_basebackup changes shows pretty clearly that it needs some refactoring, but that'd result in changes that are not backpatchable. Backpatch to 9.1 where pg_basebackup was introduced. Discussion: 20141205002854.GE21964@awork2.anarazel.de	2015-01-03 20:54:12 +01:00
Andres Freund	ccb161b66a	Add pg_string_endswith as the start of a string helper library in src/common. Backpatch to 9.3 where src/common was introduced, because a bugfix that needs to be backpatched requires the function. Earlier branches will have to duplicate the code.	2015-01-03 20:54:12 +01:00
Tom Lane	d6657d2a10	Treat negative values of recovery_min_apply_delay as having no effect. At one point in the development of this feature, it was claimed that allowing negative values would be useful to compensate for timezone differences between master and slave servers. That was based on a mistaken assumption that commit timestamps are recorded in local time; but of course they're in UTC. Nor is a negative apply delay likely to be a sane way of coping with server clock skew. However, the committed patch still treated negative delays as doing something, and the timezone misapprehension survived in the user documentation as well. If recovery_min_apply_delay were a proper GUC we'd just set the minimum allowed value to be zero; but for the moment it seems better to treat negative settings as if they were zero. In passing do some extra wordsmithing on the parameter's documentation, including correcting a second misstatement that the parameter affects processing of Restore Point records. Issue noted by Michael Paquier, who also provided the code patch; doc changes by me. Back-patch to 9.4 where the feature was introduced.	2015-01-03 13:14:03 -05:00
Tom Lane	7161b082bd	Don't run rowsecurity in parallel with other regression tests. The short-lived event trigger in the rowsecurity test causes irreproducible failures when the concurrent tests do something that the event trigger can't cope with. Per buildfarm.	2014-12-31 17:04:27 -05:00
Tom Lane	a486841eb1	Print more information about getObjectIdentityParts() failures. This might help us debug what's happening on some buildfarm members. In passing, reduce the message from ereport to elog --- it doesn't seem like this should be a user-facing case, so not worth translating.	2014-12-31 14:44:43 -05:00
Tom Lane	28551797a4	Improve consistency of parsing of psql's magic variables. For simple boolean variables such as ON_ERROR_STOP, psql has for a long time recognized variant spellings of "on" and "off" (such as "1"/"0"), and it also made a point of warning you if you'd misspelled the setting. But these conveniences did not exist for other keyword-valued variables. In particular, though ECHO_HIDDEN and ON_ERROR_ROLLBACK include "on" and "off" as possible values, none of the alternative spellings for those were recognized; and to make matters worse the code would just silently assume "on" was meant for any unrecognized spelling. Several people have reported getting bitten by this, so let's fix it. In detail, this patch: * Allows all spellings recognized by ParseVariableBool() for ECHO_HIDDEN and ON_ERROR_ROLLBACK. * Reports a warning for unrecognized values for COMP_KEYWORD_CASE, ECHO, ECHO_HIDDEN, HISTCONTROL, ON_ERROR_ROLLBACK, and VERBOSITY. * Recognizes all values for all these variables case-insensitively; previously there was a mishmash of case-sensitive and case-insensitive behaviors. Back-patch to all supported branches. There is a small risk of breaking existing scripts that were accidentally failing to malfunction; but the consensus is that the chance of detecting real problems and preventing future mistakes outweighs this.	2014-12-31 12:18:50 -05:00
Alvaro Herrera	ba66c9d068	Add missing pstrdup calls The one for the OCLASS_COLLATION case was noticed by CLOBBER_CACHE_ALWAYS buildfarm members; the others I spotted by manual code inspection. Also remove a redundant check.	2014-12-31 13:19:40 -03:00
Robert Haas	c168c88577	Don't tab-complete COMMENT ON ... IS with IS. Ian Barwick	2014-12-31 11:06:43 -05:00
Alvaro Herrera	72dd233d3e	pg_event_trigger_dropped_objects: Add name/args output columns These columns can be passed to pg_get_object_address() and used to reconstruct the dropped objects identities in a remote server containing similar objects, so that the drop can be replicated. Reviewed by Stephen Frost, Heikki Linnakangas, Abhijit Menon-Sen, Andres Freund.	2014-12-30 17:41:46 -03:00
Alvaro Herrera	a676201490	Add pg_identify_object_as_address This function returns object type and objname/objargs arrays, which can be passed to pg_get_object_address. This is especially useful because the textual representation can be copied to a remote server in order to obtain the corresponding OID-based address. In essence, this function is the inverse of recently added pg_get_object_address(). Catalog version bumped due to the addition of the new function. Also add docs to pg_get_object_address.	2014-12-30 15:41:50 -03:00
Alvaro Herrera	5b447ad3a9	Fix object_address expected output Per pink buildfarm	2014-12-30 15:04:21 -03:00
Alvaro Herrera	3f88672a4e	Use TypeName to represent type names in certain commands In COMMENT, DROP, SECURITY LABEL, and the new pg_get_object_address function, we were representing types as a list of names, same as other objects; but types are special objects that require their own representation to be totally accurate. In the original COMMENT code we had a note about fixing it which was lost in the course of `c10575ff00`. Change all those places to use TypeName instead, as suggested by that comment. Right now the original coding doesn't cause any bugs, so no backpatch. It is more problematic for proposed future code that operates with object addresses from the SQL interface; type details such as array-ness are lost when working with the degraded representation. Thanks to Petr Jelínek and Dimitri Fontaine for offlist help on finding a solution to a shift/reduce grammar conflict.	2014-12-30 13:57:23 -03:00
Heikki Linnakangas	930fd68455	Revert the GinMaxItemSize calculation so that we fit 3 tuples per page. Commit `36a35c55` changed the divisor from 3 to 6, for no apparent reason. Reducing GinMaxItemSize like that created a dump/reload hazard: loading a 9.3 database to 9.4 might fail with "index row size XXX exceeds maximum 1352 for index ..." error. Revert the change. While we're at it, make the calculation slightly more accurate. It used to divide the available space on page by three, then subtract sizeof(ItemIdData), and finally round down. That's not totally accurate; the item pointers for the three items are packed tight right after the page header, but there is alignment padding after the item pointers. Change the calculation to reflect that, like BTMaxItemSize does. I tested this with different block sizes on systems with 4- and 8-byte alignment, and the value after the final MAXALIGN_DOWN was the same with both methods on all configurations. So this does not make any difference currently, but let's be tidy. Also add a comment explaining what the macro does. This fixes bug #12292 reported by Robert Thaler. Backpatch to 9.4, where the bug was introduced.	2014-12-30 14:53:11 +02:00
Tom Lane	9a11df1449	Remove duplicate assignment in new pg_get_object_address() function. Noted by Coverity.	2014-12-28 12:03:32 -05:00
Alvaro Herrera	6630420fc9	Restrict name list len for domain constraints This avoids an ugly-looking "cache lookup failure" message. Ugliness pointed out by Andres Freund.	2014-12-26 14:31:37 -03:00
Alvaro Herrera	289121a452	Remove event trigger from object_address test It is causing trouble when run in parallel mode, because dropping the function other sessions are running concurrently causes them to fail due to inability to find the function. Per buildfarm, as noted by Tom Lane.	2014-12-26 14:18:09 -03:00
Alvaro Herrera	0e5680f473	Grab heavyweight tuple lock only before sleeping We were trying to acquire the lock even when we were subsequently not sleeping in some other transaction, which opens us up unnecessarily to deadlocks. In particular, this is troublesome if an update tries to lock an updated version of a tuple and finds itself doing EvalPlanQual update chain walking; more than two sessions doing this concurrently will find themselves sleeping on each other because the HW tuple lock acquisition in heap_lock_tuple called from EvalPlanQualFetch races with the same tuple lock being acquired in heap_update -- one of these sessions sleeps on the other one to finish while holding the tuple lock, and the other one sleeps on the tuple lock. Per trouble report from Andrew Sackville-West in http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230 His scenario can be simplified down to a relatively simple isolationtester spec file which I don't include in this commit; the reason is that the current isolationtester is not able to deal with more than one blocked session concurrently and it blocks instead of raising the expected deadlock. In the future, if we improve isolationtester, it would be good to include the spec file in the isolation schedule. I posted it in http://www.postgresql.org/message-id/20141212205254.GC1768@alvh.no-ip.org Hat tip to Mark Kirkwood, who helped diagnose the trouble.	2014-12-26 13:52:27 -03:00
Noah Misch	8d9cb0bc48	Have config_sspi_auth() permit IPv6 localhost connections. Windows versions later than Windows Server 2003 map "localhost" to ::1. Account for that in the generated pg_hba.conf, fixing another oversight in commit `f6dc6dd5ba`. Back-patch to 9.0, like that commit. David Rowley and Noah Misch	2014-12-25 13:52:03 -05:00
Andres Freund	740a4ec7f4	Blindly fix a dtrace probe in lwlock.c for a removed local variable. Per buildfarm member locust.	2014-12-25 19:48:46 +01:00
Tom Lane	966115c305	Temporarily revert "Move pg_lzcompress.c to src/common." This reverts commit `60838df922`. That change needs a bit more thought to be workable. In view of the potentially machine-dependent stuff that went in today, we need all of the buildfarm to be testing those other changes.	2014-12-25 13:22:55 -05:00
Andres Freund	d72731a704	Lockless StrategyGetBuffer clock sweep hot path. StrategyGetBuffer() has proven to be a bottleneck in a number of buffer acquisition heavy workloads. To some degree this has already been alleviated by `5d7962c6`, but it still can be quite a heavy bottleneck. The problem is that in unfortunate usage patterns a single StrategyGetBuffer() call will have to look at a large number of buffers - in turn making it likely that the process will be put to sleep while still holding the spinlock. Replace most of the usage of the buffer_strategy_lock spinlock for the clock sweep by an atomic nextVictimBuffer variable. That variable, modulo NBuffers, is the current hand of the clock sweep. The buffer clock-sweep then only needs to acquire the spinlock after a wraparound. And even then only in the process that did the wrapping around. That alleviates nearly all the contention on the relevant spinlock, although significant contention on the cacheline can still exist. Reviewed-By: Robert Haas and Amit Kapila Discussion: 20141010160020.GG6670@alap3.anarazel.de, 20141027133218.GA2639@awork2.anarazel.de	2014-12-25 18:26:25 +01:00
Andres Freund	ab5194e6f6	Improve LWLock scalability. The old LWLock implementation had the problem that concurrent lock acquisitions required exclusively acquiring a spinlock. Often that could lead to acquirers waiting behind the spinlock, even if the actual LWLock was free. The new implementation doesn't acquire the spinlock when acquiring the lock itself. Instead the new atomic operations are used to atomically manipulate the state. Only the waitqueue, used solely in the slow path, is still protected by the spinlock. Check lwlock.c's header for an explanation about the used algorithm. For some common workloads on larger machines this can yield significant performance improvements. Particularly in read mostly workloads. Reviewed-By: Amit Kapila and Robert Haas Author: Andres Freund Discussion: 20130926225545.GB26663@awork2.anarazel.de	2014-12-25 17:24:30 +01:00
Andres Freund	7882c3b0b9	Convert the PGPROC->lwWaitLink list into a dlist instead of open coding it. Besides being shorter and much easier to read it changes the logic in LWLockRelease() to release all shared lockers when waking up any. This can yield some significant performance improvements - and the fairness isn't really much worse than before, as we always allowed new shared lockers to jump the queue.	2014-12-25 17:24:30 +01:00
Andres Freund	570bd2b3fd	Add capability to suppress CONTEXT: messages to elog machinery. Hiding context messages usually is not a good idea - except for rather verbose debugging/development utensils like LOG_DEBUG. There the amount of repeated context messages just bloats the log without adding information.	2014-12-25 17:24:30 +01:00
Fujii Masao	4a5593197b	Remove duplicate include of slot.h. Back-patch to 9.4, where this problem was added.	2014-12-25 22:47:53 +09:00
Fujii Masao	60838df922	Move pg_lzcompress.c to src/common. Exposing compression and decompression APIs of pglz makes possible its use by extensions and contrib modules. pglz_decompress contained a call to elog to emit an error message in case of corrupted data. This function is changed to return a status code to let its callers return an error instead. This commit is required for upcoming WAL compression feature so that the WAL reader facility can decompress the WAL data by using pglz_decompress. Michael Paquier	2014-12-25 20:46:14 +09:00
Tom Lane	5b89473d87	Add CST (China Standard Time) to our lists of timezone abbreviations. For some reason this seems to have been missed when the lists in src/timezone/tznames/ were first constructed. We can't put it in Default because of the conflict with US CST, but we should certainly list it among the alternative entries in Asia.txt. (I checked for other oversights, but all the other abbreviations that are in current use according to the IANA files seem to be accounted for.) Noted while responding to bug #12326.	2014-12-24 16:35:23 -05:00
Andrew Dunstan	3f37b6c316	Fix installcheck case for tap tests	2014-12-24 10:31:36 -05:00
Fujii Masao	3b6ca123b5	Remove unused fields from ReindexStmt. `fe263d1` changed the REINDEX logic so that those fields are not used at all, but forgot to remove them. Sawada Masahiko	2014-12-24 21:40:47 +09:00
Andres Freund	cd5ebe1edd	Suppress MSVC warning in typeStringToTypeName function. MSVC doesn't realize ereport(ERROR) doesn't return. David Rowley	2014-12-24 12:30:08 +01:00
Tom Lane	3e22753559	Remove failing collation case from object_address regression test. Per buildfarm, this test case does not yield consistent results. I don't think it's useful enough to figure out a workaround, either.	2014-12-23 16:55:51 -05:00
Alvaro Herrera	a609d96778	Revert "Use a bitmask to represent role attributes" This reverts commit `1826987a46`. The overall design was deemed unacceptable, in discussion following the previous commit message; we might find some parts of it still salvageable, but I don't want to be on the hook for fixing it, so let's wait until we have a new patch.	2014-12-23 15:35:49 -03:00
Alvaro Herrera	d7ee82e50f	Add SQL-callable pg_get_object_address This allows access to get_object_address from SQL, which is useful to obtain OID addressing information from data equivalent to that emitted by the parser. This is necessary infrastructure of a project to let replication systems propagate object dropping events to remote servers, where the schema might be different than the server originating the DROP. This patch also adds support for OBJECT_DEFAULT to get_object_address; that is, it is now possible to refer to a column's default value. Catalog version bumped due to the new function. Reviewed by Stephen Frost, Heikki Linnakangas, Robert Haas, Andres Freund, Abhijit Menon-Sen, Adam Brightwell.	2014-12-23 15:31:29 -03:00
Alvaro Herrera	1826987a46	Use a bitmask to represent role attributes The previous representation using a boolean column for each attribute would not scale as well as we want to add further attributes. Extra auxiliary functions are added to go along with this change, to make up for the lost convenience of access of the old representation. Catalog version bumped due to change in catalogs and the new functions. Author: Adam Brightwell, minor tweaks by Álvaro Reviewed by: Stephen Frost, Andres Freund, Álvaro Herrera	2014-12-23 10:22:09 -03:00
Alvaro Herrera	7eca575d1c	get_object_address: separate domain constraints from table constraints Apart from enabling comments on domain constraints, this enables a future project to replicate object dropping to remote servers: with the current mechanism there's no way to distinguish between the two types of constraints, so there's no way to know what to drop. Also added support for the domain constraint comments in psql's \dd and pg_dump. Catalog version bumped due to the change in ObjectType enum.	2014-12-23 09:06:44 -03:00
Peter Eisentraut	584e35d17c	Change local_preload_libraries to PGC_USERSET This allows it to be used with ALTER ROLE SET. Although the old setting of PGC_BACKEND prevented changes after session start, after discussion it was more useful to allow ALTER ROLE SET instead and just document that changes during a session have no effect. This is similar to how session_preload_libraries works already. An alternative would be to change things to allow PGC_BACKEND and PGC_SU_BACKEND settings to be changed by ALTER ROLE SET. But that might need further research (e.g., log_connections would probably not work). based on patch by Kyotaro Horiguchi	2014-12-22 23:05:46 -05:00
Heikki Linnakangas	955557ddcc	Move rbtree.c from src/backend/utils/misc to src/backend/lib. We have other general-purpose data structures in src/backend/lib, so it seems like a better home for the red-black tree as well.	2014-12-22 17:52:08 +02:00
Heikki Linnakangas	e7032610f7	Use a pairing heap for the priority queue in kNN-GiST searches. This performs slightly better, uses less memory, and needs slightly less code in GiST, than the Red-Black tree previously used. Reviewed by Peter Geoghegan	2014-12-22 12:05:57 +02:00
Heikki Linnakangas	2ef6c66a2b	Fix file descriptor leak at end of recovery. XLogFileInit() returns a file descriptor, which needs to be closed. The leak was short-lived, since the startup process exits shortly afterwards, but it was clearly a bug, nevertheless. Per Coverity report.	2014-12-21 21:51:59 +02:00
Alvaro Herrera	0ee98d1cbf	pg_event_trigger_dropped_objects: add behavior flags Add "normal" and "original" flags as output columns to the pg_event_trigger_dropped_objects() function. With this it's possible to distinguish which objects, among those listed, need to be explicitly referenced when trying to replicate a deletion. This is necessary so that the list of objects can be pruned to the minimum necessary to replicate the DROP command in a remote server that might have slightly different schema (for instance, TOAST tables and constraints with different names and such.) Catalog version bumped due to change of function definition. Reviewed by: Abhijit Menon-Sen, Stephen Frost, Heikki Linnakangas, Robert Haas.	2014-12-19 15:00:45 -03:00
Heikki Linnakangas	5c805d0a81	Fix timestamp in end-of-recovery WAL records. We used time(NULL) to set a TimestampTz field, which gave bogus results. Noticed while looking at pg_xlogdump output. Backpatch to 9.3 and above, where the fast promotion was introduced.	2014-12-19 17:04:20 +02:00
Andres Freund	37de8de9e3	Prevent potentially hazardous compiler/cpu reordering during lwlock release. In LWLockRelease() (and in 9.4+ LWLockUpdateVar()) we release enqueued waiters using PGSemaphoreUnlock(). As there are other sources of such unlocks, backends only wake up if MyProc->lwWaiting is set to false, which is only done in the aforementioned functions. Before this commit there were dangers because the store to lwWaiting could become visible before the store to lwWaitLink. This could happen both due to compiler reordering (on most compilers) and on some platforms due to the CPU reordering stores. The possible consequence of this is that a backend stops waiting before lwWaitLink is set to NULL. If that backend then tries to acquire another lock and has to wait there, the list could become corrupted once the lwWaitLink store is finally performed. Add a write memory barrier to prevent that issue. Unfortunately the barrier support has been only added in 9.2. Given that the issue has not knowingly been observed in praxis it seems sufficient to prohibit compiler reordering using volatile for 9.0 and 9.1. Actual problems due to compiler reordering are more likely anyway. Discussion: 20140210134625.GA15246@awork2.anarazel.de	2014-12-19 14:29:52 +01:00
Andres Freund	9959abb012	Define Assert() et al to ((void)0) to avoid pedantic warnings. gcc's -Wempty-body warns about the current usage when compiling postgres without --enable-cassert.	2014-12-19 14:27:45 +01:00
Alvaro Herrera	cd6e66572b	Use %u to print out BlockNumber variables Per Tom Lane	2014-12-18 17:59:00 -03:00
Alvaro Herrera	35192f0626	Have VACUUM log number of skipped pages due to pins Author: Jim Nasby, some kibitzing by Heikki Linnakangas. Discussion leading to current behavior and precise wording fueled by thoughts from Robert Haas and Andres Freund.	2014-12-18 17:18:33 -03:00
Tom Lane	4a14f13a0a	Improve hash_create's API for selecting simple-binary-key hash functions. Previously, if you wanted anything besides C-string hash keys, you had to specify a custom hashing function to hash_create(). Nearly all such callers were specifying tag_hash or oid_hash; which is tedious, and rather error-prone, since a caller could easily miss the opportunity to optimize by using hash_uint32 when appropriate. Replace this with a design whereby callers using simple binary-data keys just specify HASH_BLOBS and don't need to mess with specific support functions. hash_create() itself will take care of optimizing when the key size is four bytes. This nets out saving a few hundred bytes of code space, and offers a measurable performance improvement in tidbitmap.c (which was not exploiting the opportunity to use hash_uint32 for its 4-byte keys). There might be some wins elsewhere too, I didn't analyze closely. In future we could look into offering a similar optimized hashing function for 8-byte keys. Under this design that could be done in a centralized and machine-independent fashion, whereas getting it right for keys of platform-dependent sizes would've been notationally painful before. For the moment, the old way still works fine, so as not to break source code compatibility for loadable modules. Eventually we might want to remove tag_hash and friends from the exported API altogether, since there's no real need for them to be explicitly referenced from outside dynahash.c. Teodor Sigaev and Tom Lane	2014-12-18 13:36:36 -05:00
Heikki Linnakangas	ba94518aad	Change how first WAL segment on new timeline after promotion is created. Two changes: 1. When copying a WAL segment from old timeline to create the first segment on the new timeline, only copy up to the point where the timeline switch happens, and zero-fill the rest. This avoids corner cases where we might think that the copied WAL from the previous timeline belong to the new timeline. 2. If the timeline switch happens at a segment boundary, don't copy the whole old segment to the new timeline. It's pointless, because it's 100% identical to the old segment.	2014-12-18 20:23:03 +02:00
Fujii Masao	38628db8d8	Add memory barriers for PgBackendStatus.st_changecount protocol. st_changecount protocol needs the memory barriers to ensure that the apparent order of execution is as it desires. Otherwise, for example, the CPU might rearrange the code so that st_changecount is incremented twice before the modification on a machine with weak memory ordering. This surprising result can lead to bugs. This commit introduces the macros to load and store st_changecount with the memory barriers. These are called before and after PgBackendStatus entries are modified or copied into private memory, in order to prevent CPU from reordering PgBackendStatus access. Per discussion on pgsql-hackers, we decided not to back-patch this to 9.4 or before until we get an actual bug report about this. Patch by me. Review by Robert Haas.	2014-12-18 23:07:51 +09:00
Fujii Masao	19e065c049	Ensure variables live across calls in generate_series(numeric, numeric). In generate_series_step_numeric(), the variables "start_num" and "stop_num" may potentially be freed before the next call. So they should be put in a location that survives across calls. Previously they were not, which could cause incorrect behavior of generate_series(numeric, numeric). This commit fixes this problem by copying them into multi_call_memory_ctx. Andrew Gierth	2014-12-18 21:13:52 +09:00
Fujii Masao	ccf292cd2e	Update .gitignore for config.cache. Also add a comment about why regression.* aren't listed in .gitignore. Jim Nasby	2014-12-18 19:56:42 +09:00
Andres Freund	72950dc1d0	Adjust valgrind suppression to the changes in `2c03216d83`. CRC computation is now done in XLogRecordAssemble.	2014-12-18 10:45:57 +01:00
Noah Misch	43b56171b1	Recognize Makefile line continuations in fetchRegressOpts(). Back-patch to 9.0 (all supported versions). This is mere future-proofing in the context of the master branch, but commit `f6dc6dd5ba` requires it of older branches.	2014-12-18 03:55:17 -05:00
Fujii Masao	26674c923d	Remove odd blank line in comment. Etsuro Fujita	2014-12-18 17:33:38 +09:00
Andres Freund	c303e9e7e5	Fix (re-)starting from a basebackup taken off a standby after a failure. When starting up from a basebackup taken off a standby extra logic has to be applied to compute the point where the data directory is consistent. Normal base backups use a WAL record for that purpose, but that isn't possible on a standby. That logic had an error check ensuring that the cluster's control file indicates being in recovery. Unfortunately that check was too strict, disregarding the fact that the control file could also indicate that the cluster was shut down while in recovery. That's possible when a cluster starting from a basebackup is shut down before the backup label has been removed. When everything goes well that's a short window, but when either restore_command or primary_conninfo isn't configured correctly the window can get much wider. That's because in between reading and unlinking the label we restore the last checkpoint from WAL, which can need additional WAL. To fix, simply also allow starting when the control file indicates "shutdown in recovery". There are nicer fixes imaginable, but they'd be more invasive. Backpatch to 9.2 where support for taking basebackups from standbys was added.	2014-12-18 08:47:27 +01:00
Noah Misch	40c598fa15	Fix previous commit for TAP test suites in VPATH builds. Per buildfarm member crake. Back-patch to 9.4, where the TAP suites were introduced.	2014-12-18 01:24:57 -05:00
Noah Misch	f6dc6dd5ba	Lock down regression testing temporary clusters on Windows. Use SSPI authentication to allow connections exclusively from the OS user that launched the test suite. This closes on Windows the vulnerability that commit `be76a6d39e` closed on other platforms. Users of "make installcheck" or custom test harnesses can run "pg_regress --config-auth=DATADIR" to activate the same authentication configuration that "make check" would use. Back-patch to 9.0 (all supported versions). Security: CVE-2014-0067	2014-12-17 22:48:40 -05:00
Tom Lane	fc2ac1fb41	Allow CHECK constraints to be placed on foreign tables. As with NOT NULL constraints, we consider that such constraints are merely reports of constraints that are being enforced by the remote server (or other underlying storage mechanism). Their only real use is to allow planner optimizations, for example in constraint-exclusion checks. Thus, the code changes here amount to little more than removal of the error that was formerly thrown for applying CHECK to a foreign table. (In passing, do a bit of cleanup of the ALTER FOREIGN TABLE reference page, which had accumulated some weird decisions about ordering etc.) Shigeru Hanada and Etsuro Fujita, reviewed by Kyotaro Horiguchi and Ashutosh Bapat.	2014-12-17 17:00:53 -05:00
Heikki Linnakangas	ce01548d4f	Clarify the regexp used to detect source files in MSVC builds. The old pattern would match files with strange extensions like .ry or .lpp. Refactor it to only include files with known extensions, and to make it more readable. Per Andrew Dunstan's suggestion.	2014-12-17 21:55:26 +02:00
Tom Lane	c340494235	Fix another poorly worded error message. Spotted by Álvaro Herrera.	2014-12-17 13:22:07 -05:00
Tom Lane	c977b8cffc	Fix poorly worded error message. Adam Brightwell, per report from Martín Marqués.	2014-12-17 13:14:53 -05:00
Magnus Hagander	6964ad95d7	Add missing documentation for some vcregress modes Michael Paquier	2014-12-17 11:14:34 +01:00
Tom Lane	66709133c7	Fix off-by-one loop count in MapArrayTypeName, and get rid of static array. MapArrayTypeName would copy up to NAMEDATALEN-1 bytes of the base type name, which of course is wrong: after prepending '_' there is only room for NAMEDATALEN-2 bytes. Aside from being the wrong result, this case would lead to overrunning the statically allocated work buffer. This would be a security bug if the function were ever used outside bootstrap mode, but it isn't, at least not in any currently supported branches. Aside from fixing the off-by-one loop logic, this patch gets rid of the static work buffer by having MapArrayTypeName pstrdup its result; the sole caller was already doing that, so this just requires moving the pstrdup call. This saves a few bytes but mainly it makes the API a lot cleaner. Back-patch on the off chance that there is some third-party code using MapArrayTypeName with less-secure input. Pushing pstrdup into the function should not cause any serious problems for such hypothetical code; at worst there might be a short term memory leak. Per Coverity scanning.	2014-12-16 15:35:33 -05:00
Andrew Dunstan	c8315930e6	Fix some jsonb issues found by Coverity in recent commits. Mostly these issues concern the non-use of function results. These have been changed to use (void) pushJsonbValue(...) instead of assigning the result to a variable that gets overwritten before it is used. There is a larger issue that we should possibly examine the API for pushJsonbValue(), so that instead of returning a value it modifies a state argument. The current idiom is rather clumsy. However, changing that requires quite a bit more work, so this change should do for the moment.	2014-12-16 10:32:06 -05:00
Heikki Linnakangas	4d65e16a6f	Misc comment typo fixes. Backpatch the applicable parts, just to make backpatching future patches easier.	2014-12-16 16:37:46 +02:00
Heikki Linnakangas	da9f6a78ef	Fix incorrect comment about XLogRecordBlockHeader.data_length field. It does not include the possible full-page image. While at it, reformat the comment slightly to make it more readable. Reported by Rahila Syed	2014-12-16 15:41:58 +02:00
Noah Misch	0916eba131	Fix commit_ts test suite for systems with coarse timestamp granularity. Noticed on a couple of Windows configurations. Petr Jelinek, reviewed by Michael Paquier.	2014-12-15 20:56:09 -05:00
Peter Eisentraut	733a264ddc	Translation updates	2014-12-15 16:19:59 -05:00
Alvaro Herrera	4576b9cc46	add missing newline	2014-12-15 16:49:41 -03:00
Tom Lane	9418820efb	Fix point <-> polygon code for zero-distance case. "PG_RETURN_FLOAT8(x)" is not "return x", except perhaps by accident on some platforms.	2014-12-15 14:04:27 -05:00
Heikki Linnakangas	4520ba6769	Add point <-> polygon distance operator. Alexander Korotkov, reviewed by Emre Hasegeli.	2014-12-15 17:06:21 +02:00
Peter Eisentraut	ee3bec5e22	Translation updates	2014-12-15 00:25:35 -05:00
Andrew Dunstan	e39b6f953e	Add CINE option for CREATE TABLE AS and CREATE MATERIALIZED VIEW Fabrízio de Royes Mello reviewed by Rushabh Lathia.	2014-12-13 13:56:09 -05:00
Tom Lane	b0f479113a	Repair corner-case bug in array version of percentile_cont(). The code for advancing through the input rows overlooked the case that we might already be past the first row of the row pair now being considered, in case the previous percentile also fell between the same two input rows. Report and patch by Andrew Gierth; logic rewritten a bit for clarity by me.	2014-12-13 11:49:41 -05:00
Heikki Linnakangas	50f2c0687f	Remove duplicate #define Mark Dilger	2014-12-13 18:22:07 +02:00
Tom Lane	1c5c70df45	Avoid instability in output of new REINDEX SCHEMA test. The planner seems to like to do this join query as a hash join, making the output ordering machine-dependent; worse, it's a hash on OIDs, so that it's a bit astonishing that the result doesn't change from run to run even on one machine. Add an ORDER BY to get consistent results. Per buildfarm. I also suppressed output from the final DROP SCHEMA CASCADE, to avoid occasional failures similar to those fixed in commit `81d815dc3e`. That hasn't been observed in the buildfarm yet, but it seems likely to happen in future if we leave it as-is.	2014-12-12 15:49:09 -05:00
Andrew Dunstan	7e354ab9fe	Add several generator functions for jsonb that exist for json. The functions are: to_jsonb() jsonb_object() jsonb_build_object() jsonb_build_array() jsonb_agg() jsonb_object_agg() Also along the way some better logic is implemented in json_categorize_type() to match that in the newly implemented jsonb_categorize_type(). Andrew Dunstan, reviewed by Pavel Stehule and Alvaro Herrera.	2014-12-12 15:31:14 -05:00
Andrew Dunstan	237a882443	Add json_strip_nulls and jsonb_strip_nulls functions. The functions remove object fields, including in nested objects, that have null as a value. In certain cases this can lead to considerably smaller datums, with no loss of semantic information. Andrew Dunstan, reviewed by Pavel Stehule.	2014-12-12 09:00:43 -05:00
Heikki Linnakangas	b1332e98c4	Put the logic to decide which synchronous standby is active into a function. This avoids duplicating the code. Michael Paquier, reviewed by Simon Riggs and me	2014-12-12 14:26:42 +02:00
Peter Eisentraut	2f8607860b	SSL tests: Remove trailing blank lines	2014-12-11 21:33:58 -05:00
Peter Eisentraut	ce37eff06d	SSL tests: Silence pg_ctl output Otherwise the pg_ctl start and stop messages get mixed up with the TAP output, which isn't technically valid.	2014-12-11 21:32:30 -05:00
Tom Lane	462bd95705	Fix planning of SELECT FOR UPDATE on child table with partial index. Ordinarily we can omit checking of a WHERE condition that matches a partial index's condition, when we are using an indexscan on that partial index. However, in SELECT FOR UPDATE we must include the "redundant" filter condition in the plan so that it gets checked properly in an EvalPlanQual recheck. The planner got this mostly right, but improperly omitted the filter condition if the index in question was on an inheritance child table. In READ COMMITTED mode, this could result in incorrectly returning just-updated rows that no longer satisfy the filter condition. The cause of the error is using get_parse_rowmark() when get_plan_rowmark() is what should be used during planning. In 9.3 and up, also fix the same mistake in contrib/postgres_fdw. It's currently harmless there (for lack of inheritance support) but wrong is wrong, and the incorrect code might get copied to someplace where it's more significant. Report and fix by Kyotaro Horiguchi. Back-patch to all supported branches.	2014-12-11 21:02:25 -05:00
Tom Lane	2db576ba8c	Fix corner case where SELECT FOR UPDATE could return a row twice. In READ COMMITTED mode, if a SELECT FOR UPDATE discovers it has to redo WHERE-clause checking on rows that have been updated since the SELECT's snapshot, it invokes EvalPlanQual processing to do that. If this first occurs within a non-first child table of an inheritance tree, the previous coding could accidentally re-return a matching row from an earlier, already-scanned child table. (And, to add insult to injury, I think this could make it miss returning a row that should have been returned, if the updated row that this happens on should still have passed the WHERE qual.) Per report from Kyotaro Horiguchi; the added isolation test is based on his test case. This has been broken for quite awhile, so back-patch to all supported branches.	2014-12-11 19:37:36 -05:00
Simon Riggs	2646d2d4a9	Further changes to REINDEX SCHEMA Ensure we reindex indexes built on Mat Views. Based on patch from Michael Paquier Add thorough tests to check that indexes on tables, toast tables and mat views are reindexed. Simon Riggs	2014-12-11 22:54:05 +00:00
Tom Lane	0845264642	Make rowsecurity test clean up after itself, too. Leaving global objects like roles hanging around is bad practice.	2014-12-11 17:45:35 -05:00
Tom Lane	58af84f4bb	Fix completely broken REINDEX SCHEMA testcase. Aside from not testing the case it claimed to test (namely a permissions failure), it left a login-capable role lying around, which quite aside from possibly being a security hole would cause subsequent regression runs to fail since the role would already exist.	2014-12-11 17:37:17 -05:00
Tom Lane	06d5803ffa	Fix assorted confusion between Oid and int32. In passing, also make some debugging elog's in pgstat.c a bit more consistently worded. Back-patch as far as applicable (9.3 or 9.4; none of these mistakes are really old). Mark Dilger identified and patched the type violations; the message rewordings are mine.	2014-12-11 15:41:15 -05:00
Heikki Linnakangas	10eb7dfa9b	Use correct macro for reltablespace. It's an OID. WRITE_UINT_FIELD is identical to WRITE_OID_FIELD, but let's be tidy. Mark Dilger	2014-12-11 10:19:50 +02:00
Peter Eisentraut	7442a88997	Fix typo Author: Fabrízio de Royes Mello <fabriziomello@gmail.com>	2014-12-10 20:55:30 -05:00
Tom Lane	24688f4e5a	Fix minor thinko in convertToJsonb(). The amount of space to reserve for the value's varlena header is VARHDRSZ, not sizeof(VARHDRSZ). The latter coding accidentally failed to fail because of the way the VARHDRSZ macro is currently defined; but if we ever change it to return size_t (as one might reasonably expect it to do), convertToJsonb() would have failed. Spotted by Mark Dilger.	2014-12-10 19:06:27 -05:00
Heikki Linnakangas	e39250c644	Add a regression test suite for SSL support. It's not run by the global "check" or "installcheck" targets, because the temporary installation it creates accepts TCP connections from any user on the same host, which is insecure.	2014-12-09 17:37:20 +02:00
Simon Riggs	ae4e6887a4	Silence REINDEX Previously REINDEX DATABASE and REINDEX SCHEMA produced a stream of NOTICE messages. Removing that since it is inconsistent for such a command to produce output without a VERBOSE option.	2014-12-09 18:05:36 +09:00
Simon Riggs	1135aabab5	Execute 18 tests for src/bin/scripts/t/090.. Some requests count as two tests.	2014-12-09 01:51:02 +09:00
Simon Riggs	fe263d115a	REINDEX SCHEMA Add new SCHEMA option to REINDEX and reindexdb. Sawada Masahiko Reviewed by Michael Paquier and Fabrízio de Royes Mello	2014-12-09 00:28:00 +09:00
Simon Riggs	8001fe67a3	Windows: use GetSystemTimePreciseAsFileTime if available PostgreSQL on Windows 8 or Windows Server 2012 will now get high-resolution timestamps by dynamically loading the GetSystemTimePreciseAsFileTime function. It'll fall back to GetSystemTimeAsFileTime if the higher precision variant isn't found, so the same binaries work without problems on older Windows releases. No attempt is made to detect the Windows version. Only the presence or absence of the desired function is considered. Craig Ringer	2014-12-08 23:36:06 +09:00
Simon Riggs	519b0757a3	Use GetSystemTimeAsFileTime directly in win32 PostgreSQL was calling GetSystemTime followed by SystemTimeToFileTime in the win32 port gettimeofday function. This is not necessary and limits the reported precision to the 1ms granularity that the SYSTEMTIME struct can represent. By using GetSystemTimeAsFileTime we avoid unnecessary conversions and capture timestamps at 100ns granularity, which is then rounded to 1µs granularity for storage in a PostgreSQL timestamp. On most Windows systems this change will actually have no significant effect on timestamp resolution as the system timer tick is typically between 1ms and 15ms depending on what timer resolution currently running applications have requested. You can check this with clockres.exe from sysinternals. Despite the platform limitation this change still permits capture of finer timestamps where the system is capable of producing them and it gets rid of an unnecessary syscall. The higher resolution GetSystemTimePreciseAsFileTime call available on Windows 8 and Windows Server 2012 has the same interface as GetSystemTimeAsFileTime, so switching to GetSystemTimeAsFileTime makes it easier to use the Precise variant later. Craig Ringer, reviewed by David Rowley	2014-12-08 23:32:03 +09:00
Simon Riggs	c270754719	Remove duplicate code in heap_prune_chain() No need to set tuple tableOid twice Jim Nasby	2014-12-08 08:44:37 +09:00
Simon Riggs	618c9430a8	Event Trigger for table_rewrite Generate a table_rewrite event when ALTER TABLE attempts to rewrite a table. Provide helper functions to identify table and reason. Intended use case is to help assess or to react to schema changes that might hold exclusive locks for long periods. Dimitri Fontaine, triggering an edit by Simon Riggs Reviewed in detail by Michael Paquier	2014-12-08 00:55:28 +09:00
Simon Riggs	b8e33a85d4	Tweaks for recovery_target_action Rename parameter action_at_recovery_target to recovery_target_action suggested by Christoph Berg. Place into recovery.conf suggested by Fujii Masao, replacing (deprecating) earlier parameters, per Michael Paquier.	2014-12-07 21:55:29 +09:00
Heikki Linnakangas	198cbe0a0c	Give a proper error message if initdb password file is empty. Used to say just "could not read password from file "...": Success", which isn't very informative. Mats Erik Andersson. Backpatch to all supported versions.	2014-12-05 14:30:31 +02:00
Heikki Linnakangas	c0f279c469	Don't include file type bits in tar archive's mode field. The "file mode" bits in the tar file header are not supposed to include the file type bits, e.g. S_IFREG or S_IFDIR. The file type is stored in a separate field. This isn't a problem in practice, all tar programs ignore the extra bits, but let's be tidy. This came up in a discussion around bug #11949, reported by Hendrik Grewe, although this doesn't fix the issue with tar --append. That turned out to be a bug in GNU tar. Schilly's tartest program revealed this defect in the tar created by pg_basebackup. This problem goes back as far as we've had pg_basebackup, but since this hasn't caused any problems in practice, let's be conservative and fix in master only.	2014-12-05 13:54:21 +02:00
Heikki Linnakangas	b27b6e75af	Remove erroneous EXTRA_CLEAN line from Makefile. After commit `da34731`, these files are not generated files anymore. Adam Brightwell	2014-12-05 12:17:56 +02:00
Heikki Linnakangas	326b6f009f	Print new track_commit_timestamp in rm_desc of a parameter-change record. Michael Paquier	2014-12-05 12:11:43 +02:00
Heikki Linnakangas	c846e67c46	Print wal_log_hints in the rm_desc routine of a parameter-change record. It was an oversight in the original commit. Also note in the sample config file that changing wal_log_hints requires a restart. Michael Paquier. Backpatch to 9.4, where wal_log_hints was added.	2014-12-05 12:00:48 +02:00
Robert Haas	9a94629833	Don't dump core if pq_comm_reset() is called before pq_init(). This can happen if an error occurs in a standalone backend. This bug was introduced by commit `2bd9e412f9`. Reported by Álvaro Herrera.	2014-12-04 19:49:43 -05:00
Peter Eisentraut	b58233c71b	Fix PGXS vpath build when PostgreSQL is built with vpath PGXS computes srcdir from VPATH, PostgreSQL proper computes VPATH from srcdir, and doing both results in an error from make. Conditionalize so only one of these takes effect.	2014-12-04 17:02:02 -05:00
Peter Eisentraut	e4b5a070b4	Revert haphazard pgxs makefile changes These changes were originally submitted as "adds support for VPATH with USE_PGXS", but they are not necessary for VPATH support, so they just add more lines of code for no reason.	2014-12-04 08:07:59 -05:00
Peter Eisentraut	eb1c3f4786	Remove USE_VPATH make variable from PGXS The user can just set VPATH directly. There is no need to invent another variable.	2014-12-04 08:07:41 -05:00
Peter Eisentraut	1e95bbc870	Fix SHLIB_PREREQS use in contrib, allowing PGXS builds dblink and postgres_fdw use SHLIB_PREREQS = submake-libpq to build libpq first. This doesn't work in a PGXS build, because there is no libpq to build. So just omit setting SHLIB_PREREQS in this case. Note that PGXS users can still use SHLIB_PREREQS (although it is not documented). The problem here is only that contrib modules can be built in-tree or using PGXS, and the prerequisite is only applicable in the former case. Commit `6697aa2bc2` previously attempted to address this by creating a somewhat fake submake-libpq target in Makefile.global. That was not the right fix, and it was also done in a nonportable way, so revert that.	2014-12-04 07:58:12 -05:00
Peter Eisentraut	e86507d770	Move PG_AUTOCONF_FILENAME definition Since this is not something that a user should change, pg_config_manual.h was an inappropriate place for it. In initdb.c, remove the use of the macro, because utils/guc.h can't be included by non-backend code. But we hardcode all the other configuration file names there, so this isn't a disaster.	2014-12-03 19:54:01 -05:00
Alvaro Herrera	73c986adde	Keep track of transaction commit timestamps Transactions can now set their commit timestamp directly as they commit, or an external transaction commit timestamp can be fed from an outside system using the new function TransactionTreeSetCommitTsData(). This data is crash-safe, and truncated at Xid freeze point, same as pg_clog. This module is disabled by default because it causes a performance hit, but can be enabled in postgresql.conf requiring only a server restart. A new test in src/test/modules is included. Catalog version bumped due to the new subdirectory within PGDATA and a couple of new SQL functions. Authors: Álvaro Herrera and Petr Jelínek Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut	2014-12-03 11:53:02 -03:00
Alvaro Herrera	6597ec9be6	Fix typos	2014-12-03 11:52:15 -03:00
Peter Eisentraut	bc2f43eaa4	Fix whitespace	2014-12-02 23:45:03 -05:00
Alvaro Herrera	da34731bd3	Install kludges to fix check-world for src/test/modules check-world failed in a completely clean tree, because src/test/modules fail to build unless errcodes.h is generated first. To fix this, install a dependency in src/test/modules' Makefile so that the necessary file is generated. Even with this, running "make check" within individual module subdirs will still fail because the dependency is not considered there, but this case is less interesting and would be messier to fix. check-world still failed with the above fix in place, this time because dummy_seclabel used LOAD to load the dynamic library, which doesn't work because the @libdir@ (expanded by the makefile) is expanded to the final install path, not the temporary installation directory used by make check. To fix, tweak things so that CREATE EXTENSION can be used instead, which solves the problem because the library path is expanded by the backend, which is aware of the true libdir.	2014-12-02 23:43:53 -03:00
Tom Lane	475aedd1ef	Improve error messages for malformed array input strings. Make the error messages issued by array_in() uniformly follow the style ERROR: malformed array literal: "actual input string" DETAIL: specific complaint here and rewrite many of the specific complaints to be clearer. The immediate motivation for doing this is a complaint from Josh Berkus that json_to_record() produced an unintelligible error message when dealing with an array item, because it tries to feed the JSON-format array value to array_in(). Really it ought to be smart enough to perform JSON-to-Postgres array conversion, but that's a future feature not a bug fix. In the meantime, this change is something we agreed we could back-patch into 9.4, and it should help de-confuse things a bit.	2014-12-02 18:23:27 -05:00
Andres Freund	0fd38e1370	Don't skip SQL backends in logical decoding for visibility computation. The logical decoding patchset introduced the PROC_IN_LOGICAL_DECODING PGXACT flag, which allows such backends to be skipped when computing the xmin horizon/snapshots. That's fine and sensible for walsenders streaming out logical changes, but not at all fine for SQL backends doing logical decoding. If the latter set that flag, any change they have performed outside of logical decoding will not be regarded as visible - which e.g. can lead to that change being vacuumed away. Note that not setting the flag for SQL backends isn't particularly bothersome - the SQL backend doesn't do streaming, so it only runs for a limited amount of time. Per buildfarm member 'tick' and Alvaro. Backpatch to 9.4, where logical decoding was introduced.	2014-12-02 23:47:08 +01:00
Tom Lane	75ef435218	Fix JSON aggregates to work properly when final function is re-executed. Davide S. reported that json_agg() sometimes produced multiple trailing right brackets. This turns out to be because json_agg_finalfn() attaches the final right bracket, and was doing so by modifying the aggregate state in-place. That's verboten, though unfortunately it seems there's no way for nodeAgg.c to check for such mistakes. Fix that back to 9.3 where the broken code was introduced. In 9.4 and HEAD, likewise fix json_object_agg(), which had copied the erroneous logic. Make some cosmetic cleanups as well.	2014-12-02 15:02:37 -05:00
Tom Lane	1511521a36	Minor cleanup of function declarations for BRIN. Get rid of PG_FUNCTION_INFO_V1() macros, which are quite inappropriate for built-in functions (possibly leftovers from testing as a loadable module?). Also, fix gratuitous inconsistency between SQL-level and C-level names of the minmax support functions.	2014-12-02 14:07:54 -05:00
Alvaro Herrera	3325624377	dummy_seclabel: add sql/, expected/, and .gitignores. Michael Paquier	2014-12-02 11:14:56 -03:00
Tom Lane	0927bf8060	Guard against bad "dscale" values in numeric_recv(). We were not checking to see if the supplied dscale was valid for the given digit array when receiving binary-format numeric values. While dscale can validly be more than the number of nonzero fractional digits, it shouldn't be less; that case causes fractional digits to be hidden on display even though they're there and participate in arithmetic. Bug #12053 from Tommaso Sala indicates that there's at least one broken client library out there that sometimes supplies an incorrect dscale value, leading to strange behavior. This suggests that simply throwing an error might not be the best response; it would lead to failures in applications that might seem to be working fine today. What seems the least risky fix is to truncate away any digits that would be hidden by dscale. This preserves the existing behavior in terms of what will be printed for the transmitted value, while preventing subsequent arithmetic from producing results inconsistent with that. In passing, throw a specific error for the case of dscale being outside the range that will fit into a numeric's header. Before you got "value overflows numeric format", which is a bit misleading. Back-patch to all supported branches.	2014-12-01 15:25:02 -05:00
Alvaro Herrera	df761e3cf7	Move security_label test. Rather than have the core security_label regression test depend on the dummy_seclabel module, have that part of the test be executed by dummy_seclabel itself directly. This simplifies the testing rig a bit; in particular it should silence the problems from the MSVC buildfarm phylum, which haven't yet gotten taught how to install src/test/modules.	2014-12-01 16:12:43 -03:00
Andrew Dunstan	e09996ff8d	Fix hstore_to_json_loose's detection of valid JSON number values. We expose a function IsValidJsonNumber that internally calls the lexer for json numbers. That allows us to use the same test everywhere, instead of inventing a broken test for hstore conversions. The new function is also used in datum_to_json, replacing the code that is now moved to the new function. Backpatch to 9.3 where hstore_to_json_loose was introduced.	2014-12-01 11:28:45 -05:00
Heikki Linnakangas	4e86f1b16d	Put SSL_pending() call behind the new internal SSL API. It seems likely that any SSL implementation will need a similar call, not just OpenSSL.	2014-12-01 17:45:04 +02:00
Tom Lane	866737c923	Add a #define for the inet overlaps operator. Extracted from pending inet selectivity patch. The rest of it isn't quite ready to commit, but we might as well push this part so the patch doesn't have to track the moving target of pg_operator.h.	2014-11-30 19:43:43 -05:00
Tom Lane	1adbb347ec	Fix minor bugs in commit `30bf4689a9` et al. Coverity complained that the "else" added to fillPGconn() was unreachable, which it was. Remove the dead code. In passing, rearrange the tests so as not to bother trying to fetch values for options that can't be assigned. Pre-9.3 did not have that issue, but it did have a "return" that should be "goto oom_error" to ensure that a suitable error message gets filled in.	2014-11-30 12:20:44 -05:00
Alvaro Herrera	22dfd116a1	Move test modules from contrib to src/test/modules. This is advance preparation for introducing even more test modules; the easy solution is to add them to contrib, but that's bloated enough that it seems a good time to think of something different. Moved modules are dummy_seclabel, test_shm_mq, test_parser and worker_spi. (test_decoding was also a candidate, but there was too much opposition to moving that one. We can always reconsider later.)	2014-11-29 23:55:00 -03:00
Noah Misch	64f86fb11e	Reimplement `9f80f4835a` with PQconninfo(). Apart from ignoring "hostaddr" set to the empty string, this behaves identically to its predecessor. Back-patch to 9.4, where the original commit first appeared. Reviewed by Fujii Masao.	2014-11-29 12:31:43 -05:00
Noah Misch	2cda889984	Revert "Add libpq function PQhostaddr()." This reverts commit `9f80f4835a`. The function returned the raw value of a connection parameter, a task served by PQconninfo(). The next commit will reimplement the psql \conninfo change that way. Back-patch to 9.4, where that commit first appeared.	2014-11-29 12:31:21 -05:00
Alvaro Herrera	816e10d800	Fix BRIN operator family definitions. The original definitions were leaving no room for cross-type operators, so queries that compared a column of one type against something of a different type were not taking advantage of the index. Fix by making the opfamilies more like the ones for Btree, and include a few cross-type operator classes. Catalog version bumped. Per complaints from Hubert Lubaczewski, Mark Wong, Heikki Linnakangas.	2014-11-28 18:09:19 -03:00
Alvaro Herrera	ae04bf5027	Update transaction README for persistent multixacts. Multixacts are now maintained during recovery, but the README didn't get the memo. Backpatch to 9.3, where the divergence was introduced.	2014-11-28 18:06:18 -03:00
Tom Lane	d25367ec4f	Add bms_get_singleton_member(), and use it where appropriate. This patch adds a function that replaces a bms_membership() test followed by a bms_singleton_member() call, performing both the test and the extraction of a singleton set's member in one scan of the bitmapset. The performance advantage over the old way is probably minimal in current usage, but it seems worthwhile on notational grounds anyway. David Rowley	2014-11-28 14:16:24 -05:00
Tom Lane	f4e031c662	Add bms_next_member(), and use it where appropriate. This patch adds a way of iterating through the members of a bitmapset nondestructively, unlike the old way with bms_first_member(). While bms_next_member() is very slightly slower than bms_first_member() (at least for typical-size bitmapsets), eliminating the need to palloc and pfree a temporary copy of the target bitmapset is a significant win. So this method should be preferred in all cases where a temporary copy would be necessary. Tom Lane, with suggestions from Dean Rasheed and David Rowley	2014-11-28 13:37:25 -05:00
Tom Lane	96d66bcfc6	Improve performance of OverrideSearchPathMatchesCurrent(). This function was initially coded on the assumption that it would not be performance-critical, but that turns out to be wrong in workloads that are heavily dependent on the speed of plpgsql functions. Speed it up by hard-coding the comparison rules, thereby avoiding palloc/pfree traffic from creating and immediately freeing an OverrideSearchPath object. Per report from Scott Marlowe.	2014-11-28 12:37:27 -05:00
Tom Lane	e384ed6cde	Improve typcache: cache negative lookup results, add invalidation logic. Previously, if the typcache had for example tried and failed to find a hash opclass for a given data type, it would nonetheless repeat the unsuccessful catalog lookup each time it was asked again. This can lead to a significant amount of useless bufmgr traffic, as in a recent report from Scott Marlowe. Like the catalog caches, typcache should be able to cache negative results. This patch arranges that by making use of separate flag bits to remember whether a particular item has been looked up, rather than treating a zero OID as an indicator that no lookup has been done. Also, install a credible invalidation mechanism, namely watching for inval events in pg_opclass. The sole advantage of the lack of negative caching was that the code would cope if operators or opclasses got added for a type mid-session; to preserve that behavior we have to be able to invalidate stale lookup results. Updates in pg_opclass should be pretty rare in production systems, so it seems sufficient to just invalidate all the dependent data whenever one happens. Adding proper invalidation also means that this code will now react sanely if an opclass is dropped mid-session. Arguably, that's a back-patchable bug fix, but in view of the lack of complaints from the field I'll refrain from back-patching. (Probably, in most cases where an opclass is dropped, the data type itself is dropped soon after, so that this misfeasance has no bad consequences.)	2014-11-28 12:19:14 -05:00
Fujii Masao	202cbdf782	Add tab-completion for ALTER TABLE ALTER CONSTRAINT in psql. Back-patch to 9.4 where ALTER TABLE ALTER CONSTRAINT was added. Michael Paquier, bug reported by Andrey Lizenko.	2014-11-28 21:29:45 +09:00
Heikki Linnakangas	afeacd2748	Fix assertion failure at end of PITR. InitXLogInsert() cannot be called in a critical section, because it allocates memory. But CreateCheckPoint() did that, when called for the end-of-recovery checkpoint by the startup process. In passing, fix the scratch space allocation in InitXLogInsert to go to the right memory context. Also update the comment at InitXLOGAccess, which hasn't been totally accurate since hot standby was introduced (in a hot standby backend, InitXLOGAccess isn't called at backend startup). Reported by Michael Paquier	2014-11-28 09:31:53 +02:00
Fujii Masao	a5eb85eb62	Make \watch respect the user's \pset null setting. Previously \watch always ignored the user's \pset null setting. \pset null setting should be ignored for \d and similar queries. For those, the code can reasonably have an opinion about what the presentation should be like, since it knows what SQL query it's issuing. This argument surely doesn't apply to \watch, so this commit makes \watch use the user's \pset null setting. Back-patch to 9.3 where \watch was added.	2014-11-28 02:42:43 +09:00
Fujii Masao	e656f5d247	Mark response messages for translation in pg_isready. Back-patch to 9.3 where pg_isready was added. Mats Erik Andersson	2014-11-28 02:12:45 +09:00
Stephen Frost	143b39c185	Rename pg_rowsecurity -> pg_policy and other fixes. As pointed out by Robert, we should really have named pg_rowsecurity pg_policy, as the objects stored in that catalog are policies. This patch fixes that and updates the column names to start with 'pol' to match the new catalog name. The security consideration for COPY with row level security, also pointed out by Robert, has also been addressed by remembering and re-checking the OID of the relation initially referenced during COPY processing, to make sure it hasn't changed under us by the time we finish planning out the query which has been built. Robert and Alvaro also commented on missing OCLASS and OBJECT entries for POLICY (formerly ROWSECURITY or POLICY, depending) in various places. This patch fixes that too, which also happens to add the ability to COMMENT on policies. In passing, attempt to improve the consistency of messages, comments, and documentation as well. This removes various incarnations of 'row-security', 'row-level security', 'Row-security', etc, in favor of 'policy', 'row level security' or 'row_security' as appropriate. Happy Thanksgiving!	2014-11-27 01:15:57 -05:00
Heikki Linnakangas	1812ee5767	Remove dead function prototype. It was added in commit `efc16ea5`, but never defined.	2014-11-26 11:05:54 +02:00
Robert Haas	a6c84c770e	Attempt to suppress uninitialized variable warning. Report by Heikki Linnakangas.	2014-11-25 20:07:07 -05:00
Tom Lane	d934a05234	Fix uninitialized-variable warning. In passing, add an Assert defending the presumption that bytes_left is positive to start with. (I'm not exactly convinced that using an unsigned type was such a bright thing here, but let's at least do this much.)	2014-11-25 15:17:16 -05:00
Simon Riggs	aedccb1f6f	action_at_recovery_target recovery config option. action_at_recovery_target = pause | promote | shutdown. Petr Jelinek. Reviewed by Muhammad Asif Naeem, Fujii Masao and Simon Riggs	2014-11-25 20:13:30 +00:00
Tom Lane	bb1b8f694a	De-reserve most statement-introducing keywords in plpgsql. Add a bit of context sensitivity to plpgsql_yylex() so that it can recognize when the word it is looking at is the first word of a new statement, and if so whether it is the target of an assignment statement. When we are at start of statement and it's not an assignment, we can prefer recognizing unreserved keywords over recognizing variable names, thereby allowing most statements' initial keywords to be demoted from reserved to unreserved status. This is rather useful already (there are 15 such words that get demoted here), and what's more to the point is that future patches proposing to add new plpgsql statements can avoid objections about having to add new reserved words. The keywords BEGIN, DECLARE, FOR, FOREACH, LOOP, WHILE need to remain reserved because they can be preceded by block labels, and the logic added here doesn't understand about block labels. In principle we could probably fix that, but it would take more than one token of lookback and the benefit doesn't seem worth extra complexity. Also note I didn't de-reserve EXECUTE, because it is used in more places than just statement start. It's possible it could be de-reserved with more work, but that would be an independent fix. In passing, also de-reserve COLLATE and DEFAULT, which shouldn't have been reserved in the first place since they only need to be recognized within DECLARE sections.	2014-11-25 15:02:09 -05:00
Tom Lane	bac27394a1	Support arrays as input to array_agg() and ARRAY(SELECT ...). These cases formerly failed with errors about "could not find array type for data type". Now they yield arrays of the same element type and one higher dimension. The implementation involves creating functions with API similar to the existing accumArrayResult() family. I (tgl) also extended the base family by adding an initArrayResult() function, which allows callers to avoid special-casing the zero-inputs case if they just want an empty array as result. (Not all do, so the previous calling convention remains valid.) This allowed simplifying some existing code in xml.c and plperl.c. Ali Akbar, reviewed by Pavel Stehule, significantly modified by me	2014-11-25 12:21:28 -05:00
Stephen Frost	25976710df	Add int64 -> int8 mapping to genbki. Per discussion with Tom and Andrew, 64bit integers are no longer a problem for the catalogs, so go ahead and add the mapping from the C int64 type to the int8 SQL identification to allow using them. Patch by Adam Brightwell	2014-11-25 12:12:19 -05:00
Heikki Linnakangas	b3fc6727ce	Allow using connection URI in primary_conninfo. The old method of appending options to the connection string didn't work if the primary_conninfo was a postgres:// style URI, instead of a traditional connection string. Use PQconnectdbParams instead. Alex Shulgin	2014-11-25 18:26:05 +02:00
Heikki Linnakangas	add1b052e2	Allow "dbname" from connection string to be overridden in PQconnectdbParams. If the "dbname" attribute in PQconnectdbParams contained a connection string or URI (and expand_dbname = TRUE), the database name from the connection string could not be overridden by a subsequent "dbname" keyword in the array. That was not intentional; all other options can be overridden. Furthermore, any subsequent "dbname" caused the connection string from the first dbname value to be processed again, overriding any values for the same options that were given between the connection string and the second dbname option. In passing, clarify in the docs that only the first dbname option in the array is parsed as a connection string. Alex Shulgin. Backpatch to all supported versions.	2014-11-25 17:39:44 +02:00
Stephen Frost	81d815dc3e	Suppress DROP CASCADE notices in regression tests. In the regression tests, when doing cascaded drops, we need to suppress the notices from DROP CASCADE or there can be transient regression failures as the order of drops can depend on the physical row order in pg_depend. Report and fix suggestion from Tom.	2014-11-25 10:04:49 -05:00
Heikki Linnakangas	30bf4689a9	Check return value of strdup() in libpq connection option parsing. An out-of-memory in most of these would lead to strange behavior, like connecting to a different database than intended, but some would lead to an outright segfault. Alex Shulgin and me. Backpatch to all supported versions.	2014-11-25 14:10:16 +02:00
Heikki Linnakangas	e453cc2741	Make Port->ssl_in_use available, even when built with !USE_SSL. Code that checks the flag no longer needs #ifdef's, which is more convenient. In particular, makes it easier to write extensions that depend on it. In passing, modify sslinfo's ssl_is_used function to check ssl_in_use instead of the OpenSSL specific 'ssl' pointer. It doesn't make any difference currently, as sslinfo is only compiled when built with OpenSSL, but seems cleaner anyway.	2014-11-25 09:46:11 +02:00
Robert Haas	f5d9698a84	Add infrastructure to save and restore GUC values. This is further infrastructure for parallelism. Amit Khandekar, Noah Misch, Robert Haas	2014-11-24 16:37:56 -05:00
Heikki Linnakangas	49b86fb1c9	Add a few paragraphs to B-tree README explaining L&Y algorithm. This gives an overview of what Lehman & Yao's paper is all about, so that you can understand the rest of the README without having to read the paper. Per discussion with Peter Geoghegan and others.	2014-11-24 13:43:33 +02:00
Heikki Linnakangas	0bd624d63b	Distinguish XLOG_FPI records generated for hint-bit updates. Add a new XLOG_FPI_FOR_HINT record type, and use that for full-page images generated for hint bit updates, when checksums are enabled. The new record type is replayed exactly the same as XLOG_FPI, but allows them to be tallied separately e.g. in pg_xlogdump.	2014-11-24 11:09:08 +02:00
Tom Lane	e2dc3f5772	Get rid of redundant production in plpgsql grammar. There may once have been a reason for the intermediate proc_stmts production in the plpgsql grammar, but it isn't doing anything useful anymore, so let's collapse it into proc_sect. Saves some code and probably a small number of nanoseconds per statement list. In passing, correctly alphabetize keyword lists to match pl_scanner.c; note that for "rowtype" vs "row_count", pl_scanner.c must sort on the basis of the lower-case spelling. Noted while fooling with a patch to de-reserve more plpgsql keywords.	2014-11-23 15:31:36 -05:00
Andrew Dunstan	02d5ab6a86	Fix memory leaks introduced by commit `eca2b9b`	2014-11-23 13:47:08 -05:00
Noah Misch	b779168ffe	Detect PG_PRINTF_ATTRIBUTE automatically. This eliminates gobs of "unrecognized format function type" warnings under MinGW compilers predating GCC 4.4.	2014-11-23 09:34:03 -05:00
Tom Lane	b62f94c603	Allow simplification of EXISTS() subqueries containing LIMIT. The locution "EXISTS(SELECT ... LIMIT 1)" seems to be rather common among people who don't realize that the database already performs optimizations equivalent to putting LIMIT 1 in the sub-select. Unfortunately, this was actually making things worse, because it prevented us from optimizing such EXISTS clauses into semi or anti joins. Teach simplify_EXISTS_query() to suppress constant-positive LIMIT clauses. That fixes the semi/anti-join case, and may help marginally even for cases that have to be left as sub-SELECTs. Marti Raudsepp, reviewed by David Rowley	2014-11-22 19:12:38 -05:00
Tom Lane	9c58101117	Fix mishandling of system columns in FDW queries. postgres_fdw would send query conditions involving system columns to the remote server, even though it makes no effort to ensure that system columns other than CTID match what the remote side thinks. tableoid, in particular, probably won't match and might have some use in queries. Hence, prevent sending conditions that include non-CTID system columns. Also, create_foreignscan_plan neglected to check local restriction conditions while determining whether to set fsSystemCol for a foreign scan plan node. This again would bollix the results for queries that test a foreign table's tableoid. Back-patch the first fix to 9.3 where postgres_fdw was introduced. Back-patch the second to 9.2. The code is probably broken in 9.1 as well, but the patch doesn't apply cleanly there; given the weak state of support for FDWs in 9.1, it doesn't seem worth fixing. Etsuro Fujita, reviewed by Ashutosh Bapat, and somewhat modified by me	2014-11-22 16:01:05 -05:00
Andrew Dunstan	eca2b9ba3e	Rework echo_hidden for \sf and \ef from commit `e4d2817`. PSQLexec's error reporting turns out to be too verbose for this case, so revert to using PQexec instead with minimal error reporting. Prior to calling PQexec, we call a function that mimics just the echo_hidden piece of PSQLexec.	2014-11-22 09:39:01 -05:00
Tom Lane	447770404c	Rearrange CustomScan API. Make it work more like FDW plans do: instead of assuming that there are expressions in a CustomScan plan node that the core code doesn't know about, insist that all subexpressions that need planner attention be in a "custom_exprs" list in the Plan representation. (Of course, the custom plugin can break the list apart again at executor initialization.) This lets us revert the parts of the patch that exposed setrefs.c and subselect.c processing to the outside world. Also revert the GetSpecialCustomVar stuff in ruleutils.c; that concept may work in future, but it's far from fully baked right now.	2014-11-21 18:21:46 -05:00
Tom Lane	c2ea2285e9	Simplify API for initially hooking custom-path providers into the planner. Instead of register_custom_path_provider and a CreateCustomScanPath callback, let's just provide a standard function hook in set_rel_pathlist. This is more flexible than what was previously committed, is more like the usual conventions for planner hooks, and requires less support code in the core. We had discussed this design (including centralizing the set_cheapest() calls) back in March or so, so I'm not sure why it wasn't done like this already.	2014-11-21 14:05:46 -05:00
Andrew Dunstan	4077fb4d1d	Fix an error in psql that overcounted output lines. This error counted the first line of a cell as "extra". The effect was to cause far too frequent invocation of the pager. In most cases this can be worked around (for example, by using the "less" pager with the -F flag), so don't backpatch.	2014-11-21 12:37:09 -05:00
Andrew Dunstan	e4d28175a1	Make psql's \sf and \ef honor ECHO_HIDDEN. These commands were calling the database direct rather than calling PSQLexec like other slash commands that needed database data. The code is also changed not to pass the connection as a parameter to the helper functions. It's available in a global variable, and that's what PSQLexec uses.	2014-11-21 12:14:05 -05:00
Heikki Linnakangas	622983ea69	No need to call XLogEnsureRecordSpace when the relation is unlogged. Amit Kapila	2014-11-21 15:13:15 +02:00
Heikki Linnakangas	b10a97b819	Add a comment to regress.c explaining what it contains. Ian Barwick	2014-11-21 15:07:29 +02:00
Heikki Linnakangas	8f5dcb56cb	Fix bogus comments in XLogRecordAssemble. Pointed out by Michael Paquier	2014-11-21 12:15:27 +02:00
Tom Lane	adbfab119b	Remove dead code supporting mark/restore in SeqScan, TidScan, ValuesScan. There seems no prospect that any of this will ever be useful, and indeed it's questionable whether some of it would work if it ever got called; it's certainly not been exercised in a very long time, if ever. So let's get rid of it, and make the comments about mark/restore in execAmi.c less wishy-washy. The mark/restore support for Result nodes is also currently dead code, but that's due to planner limitations not because it's impossible that it could be useful. So I left it in.	2014-11-20 20:20:54 -05:00
Tom Lane	a34fa8ee7c	Initial code review for CustomScan patch. Get rid of the pernicious entanglement between planner and executor headers introduced by commit `0b03e5951b`. Also, rearrange the CustomFoo struct/typedef definitions so that all the typedef names are seen as used by the compiler. Without this pgindent will mess things up a bit, which is not so important perhaps, but it also removes a bizarre discrepancy between the declaration arrangement used for CustomExecMethods and that used for CustomScanMethods and CustomPathMethods. Clean up the commentary around ExecSupportsMarkRestore to reflect the rather large change in its API. Const-ify register_custom_path_provider's argument. This necessitates casting away const in the function, but that seems better than forcing callers of the function to do so (or else not const-ify their method pointer structs, which was sort of the whole point). De-export fix_expr_common. I don't like the exporting of fix_scan_expr or replace_nestloop_params either, but this one surely has got little excuse.	2014-11-20 18:36:07 -05:00
Tom Lane	081a6048cf	Fix another oversight in CustomScan patch. execCurrent.c's search_plan_tree() must recognize a CustomScan on the target relation. This would only be helpful for custom providers that support CurrentOfExpr quals, which is probably a bit far-fetched, but it's not impossible I think. But even without assuming that, we need to recognize a scanned-relation match so that we will properly throw error if the desired relation is being scanned with both a CustomScan and a regular scan (ie, self-join). Also recognize ForeignScanState for similar reasons. Supporting WHERE CURRENT OF on a foreign table is probably even more far-fetched than it is for custom scans, but I think in principle you could do it with postgres_fdw (or another FDW that supports the ctid column). This would be a back-patchable bug fix if existing FDWs handled CurrentOfExpr, but I doubt any do so I won't bother back-patching.	2014-11-20 15:56:39 -05:00
Tom Lane	03e574af5f	Fix another oversight in CustomScan patch. disuse_physical_tlist() must work for all plan types handled by create_scan_plan().	2014-11-20 14:49:02 -05:00
Tom Lane	c5111ea9ca	Remove no-longer-needed phony typedefs in genbki.h. Now that we have a policy of hiding varlena catalog fields behind "#ifdef CATALOG_VARLEN", there is no need for their type names to be acceptable to the C compiler. And experimentation shows that it does not matter to pgindent either. (If it did, we'd have problems anyway, since these typedefs are unreferenced so far as the C compiler is concerned, and find_typedef fails to identify such typedefs.) Hence, remove the phony typedefs that genbki.h provided to make some varlena field definitions compilable. In passing, rearrange #define's into what seemed a more logical order.	2014-11-20 13:16:14 -05:00
Tom Lane	f9e0255c6f	Add missing case for CustomScan. Per KaiGai Kohei. In passing improve formatting of some code added in commit `30d7ae3c`, because otherwise pgindent will make a mess of it.	2014-11-20 12:32:34 -05:00
Heikki Linnakangas	f464042161	Silence compiler warning about variable being used uninitialized. It's a false positive - the variable is only used when 'onleft' is true, and it is initialized in that case. But the compiler doesn't necessarily see that.	2014-11-20 19:17:19 +02:00
Heikki Linnakangas	2c03216d83	Revamp the WAL record format. Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now dissects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-dissected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compensates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.	2014-11-20 18:46:41 +02:00
Peter Eisentraut	8dc626defe	Fix suggested layout for PGXS makefile. Custom rules must come after pgxs inclusion, not before, because any rule added before pgxs will break the default 'all' target. Author: Cédric Villemain <cedric@2ndquadrant.fr>	2014-11-19 22:21:54 -05:00
Heikki Linnakangas	88fc719263	Add test cases for indexam operations not currently covered. That includes VACUUM on GIN, GiST and SP-GiST indexes, and B-tree indexes large enough to cause page deletions in B-tree. Plus some other special cases. After this patch, the regression tests generate all different WAL record types. Not all branches within the redo functions are covered, but it's a step forward.	2014-11-19 19:47:43 +02:00
Fujii Masao	d5f4df7264	Fix bug in the test of file descriptor of current WAL file in pg_receivexlog. In pg_receivexlog, in order to check whether the current WAL file is being opened or not, its file descriptor has to be checked against -1 as an invalid value. But, oops, `7900e94` added the incorrect test checking the descriptor against 1. This commit fixes that bug. Back-patch to 9.4 where the bug was added. Spotted by Magnus Hagander	2014-11-19 19:10:04 +09:00
Fujii Masao	f66c20b317	Fix pg_receivexlog --slot so that it doesn't prevent the server shutdown. When pg_receivexlog --slot is connecting to the server, at the shutdown of the server, walsender keeps waiting for the last WAL record to be replicated and flushed in pg_receivexlog. But previously pg_receivexlog issued sync command only when WAL file was switched. So there was the case where the last WAL was never flushed and walsender had to keep waiting infinitely. This caused the server shutdown to get stuck. pg_recvlogical handles this problem by calling fsync() when it receives the request of immediate reply from the server. That is, at shutdown, walsender sends the request, pg_recvlogical receives it, flushes the last WAL record, and sends the flush location back to the server. Since walsender can see that the last WAL record is successfully flushed, it can exit cleanly. This commit introduces the same logic as pg_recvlogical has, to pg_receivexlog. Back-patch to 9.4 where pg_receivexlog was changed so that it can use the replication slot. Original patch by Michael Paquier, rewritten by me. Bug report by Furuya Osamu.	2014-11-19 14:11:12 +09:00
Tom Lane	8d7af8fbe7	Don't require bleeding-edge timezone data in timestamptz regression test. The regression test cases added in commits `b2cbced9e` et al depended in part on the Russian timezone offset changes of Oct 2014. While this is of no particular concern for a default Postgres build, it was possible for a build using --with-system-tzdata to fail the tests if the system tzdata database wasn't au courant. Bjorn Munch and Christoph Berg both complained about this while packaging 9.4rc1, so we probably shouldn't insist on the system tzdata being up-to-date. Instead, make an equivalent test using a zone change that occurred in Venezuela in 2007. With this patch, the regression tests should pass using any tzdata set from 2012 or later. (I can't muster much sympathy for somebody using --with-system-tzdata on a machine whose system tzdata is more than three years out-of-date.)	2014-11-18 21:36:39 -05:00
Tom Lane	7aa8d9e56c	Update comments in find_typedef. These comments don't seem to have been touched in a long time. Make them describe the current implementation rather than what was here last century, and be a bit more explicit about the unreferenced-typedefs issue.	2014-11-18 15:51:45 -05:00
Tom Lane	8b13e5c6c0	Fix some bogus direct uses of realloc(). pg_dump/parallel.c was using realloc() directly with no error check. While the odds of an actual failure here seem pretty low, Coverity complains about it, so fix by using pg_realloc() instead. While looking for other instances, I noticed a couple of places in psql that hadn't gotten the memo about the availability of pg_realloc. These aren't bugs, since they did have error checks, but verbosely inconsistent code is not a good thing. Back-patch as far as 9.3. 9.2 did not have pg_dump/parallel.c, nor did it have pg_realloc available in all frontend code.	2014-11-18 13:28:06 -05:00
Simon Riggs	606c0123d6	Reduce btree scan overhead for < and > strategies For <, <=, > and >= strategies, mark the first scan key as already matched if scanning in an appropriate direction. If index tuple contains no nulls we can skip the first re-check for each tuple. Author: Rajeev Rastogi Reviewer: Haribabu Kommi Rework of the code and comments by Simon Riggs	2014-11-18 10:24:55 +00:00
Heikki Linnakangas	dedae6c211	Remove obsolete debugging option, RTDEBUG. The r-tree AM that used it was removed back in 2005. Peter Geoghegan	2014-11-18 09:55:05 +02:00
Simon Riggs	be1cc8f46f	Add pg_dump --snapshot option Allows pg_dump to use a snapshot previously defined by a concurrent session that has either used pg_export_snapshot() or obtained a snapshot when creating a logical slot. When this option is used with parallel pg_dump, the snapshot defined by this option is used and no new snapshot is taken. Simon Riggs and Michael Paquier	2014-11-17 22:15:07 +00:00
Fujii Masao	c4f99d2029	Add --synchronous option to pg_receivexlog, for more reliable WAL writing. Previously pg_receivexlog flushed WAL data only when WAL file was switched. Then `3dad73e` added -F option to pg_receivexlog so that users could control how frequently sync commands were issued to WAL files. It also allowed users to make pg_receivexlog flush WAL data immediately after writing by specifying 0 in -F option. However feedback messages were not sent back immediately even after a flush location was updated. So even if WAL data was flushed in real time, the server could not see that for a while. This commit removes -F option from and adds --synchronous to pg_receivexlog. If --synchronous is specified, like the standby's wal receiver, pg_receivexlog flushes WAL data as soon as there is WAL data which has not been flushed yet. Then it sends back the feedback message identifying the latest flush location to the server. This option is useful to make pg_receivexlog behave as sync standby by using replication slot, for example. Original patch by Furuya Osamu, heavily rewritten by me. Reviewed by Heikki Linnakangas, Alvaro Herrera and Sawada Masahiko.	2014-11-18 02:32:48 +09:00
Tom Lane	bc241488b0	Update time zone data files to tzdata release 2014j. DST law changes in the Turks & Caicos Islands (America/Grand_Turk) and in Fiji. New zone Pacific/Bougainville for portions of Papua New Guinea. Historical changes for Korea and Vietnam.	2014-11-17 12:09:12 -05:00
Heikki Linnakangas	c73669c0e0	Fix WAL-logging of B-tree "unlink halfdead page" operation. There was some confusion on how to record the case that the operation unlinks the last non-leaf page in the branch being deleted. _bt_unlink_halfdead_page set the "topdead" field in the WAL record to the leaf page, but the redo routine assumed that it would be an invalid block number in that case. This commit fixes _bt_unlink_halfdead_page to do what the redo routine expected. This code is new in 9.4, so backpatch there.	2014-11-17 18:45:46 +02:00
Alvaro Herrera	0f9692b40d	Fix relpersistence setting in reindex_index Buildfarm members with CLOBBER_CACHE_ALWAYS advised us that commit `85b506bbfc` was mistaken in setting the relpersistence value of the index directly in the relcache entry, within reindex_index. The reason for the failure is that an invalidation message that comes after mucking with the relcache entry directly, but before writing it to the catalogs, would cause the entry to become rebuilt in place from catalogs with the old contents, losing the update. Fix by passing the correct persistence value to RelationSetNewRelfilenode instead; this routine also writes the updated tuple to pg_class, avoiding the problem. Suggested by Tom Lane.	2014-11-17 11:23:35 -03:00
Peter Eisentraut	7466a1b75f	Translation updates	2014-11-16 21:32:51 -05:00
Simon Riggs	0f66d21201	Emit msg re skipping ANALYZE for absent inh tree When checking a table that has an inheritance tree marked, if no child tables remain, we skip ANALYZE. This patch emits a message to show that the action has been skipped. Author: Etsuro Fujita Reviewer: Furuya Osamu	2014-11-15 22:49:54 +00:00
Alvaro Herrera	85b506bbfc	Get rid of SET LOGGED indexes persistence kludge This removes ATChangeIndexesPersistence() introduced by `f41872d0c1` which was too ugly to live for long. Instead, the correct persistence marking is passed all the way down to reindex_index, so that the transient relation built to contain the index relfilenode can get marked correctly right from the start. Author: Fabrízio de Royes Mello Review and editorialization by Michael Paquier and Álvaro Herrera	2014-11-15 01:19:49 -03:00
Alvaro Herrera	e4d1e26491	Remove unused InhPaths Allegedly, the last remaining usages of that struct were removed by `0e99be1c`. Author: Peter Geoghegan	2014-11-15 01:19:39 -03:00
Andres Freund	522c85a6a2	Fix initdb --sync-only to also sync tablespaces. `630cd14426` added initdb --sync-only, for use by pg_upgrade, by just exposing the existing fsync code. That's wrong, because initdb so far had absolutely no reason to deal with tablespaces. Fix --sync-only by additionally explicitly syncing each of the tablespaces. Backpatch to 9.3 where --sync-only was introduced. Abhijit Menon-Sen and Andres Freund	2014-11-15 01:19:40 +01:00
Andres Freund	98ec7fd903	Sync unlogged relations to disk after they have been reset. Unlogged relations are only reset when performing an unclean restart. That means they have to be synced to disk during clean shutdowns. During normal processing that's achieved by registering a buffer's file to be fsynced at the next checkpoint when flushed. But ResetUnloggedRelations() doesn't go through the buffer manager, so nothing will force reset relations to disk before the next shutdown checkpoint. So just make ResetUnloggedRelations() fsync the newly created main forks to disk. Discussion: 20140912112246.GA4984@alap3.anarazel.de Backpatch to 9.1 where unlogged tables were introduced. Abhijit Menon-Sen and Andres Freund	2014-11-15 01:19:31 +01:00
Andres Freund	d3586fc8aa	Ensure unlogged tables are reset even if crash recovery errors out. Unlogged relations are reset at the end of crash recovery as they're only synced to disk during a proper shutdown. Unfortunately that and later steps can fail, e.g. due to running out of space. This reset was, up to now performed after marking the database as having finished crash recovery successfully. As out of space errors trigger a crash restart that could lead to the situation that not all unlogged relations are reset. Once that happened usage of unlogged relations could yield errors like "could not open file "...": No such file or directory". Luckily clusters that show the problem can be fixed by performing an immediate shutdown, and starting the database again. To fix, just call ResetUnloggedRelations(UNLOGGED_RELATION_INIT) earlier, before marking the database as having successfully recovered. Discussion: 20140912112246.GA4984@alap3.anarazel.de Backpatch to 9.1 where unlogged tables were introduced. Abhijit Menon-Sen and Andres Freund	2014-11-15 01:19:26 +01:00
Stephen Frost	80eacaa3cd	Clean up includes from RLS patch The initial patch for RLS mistakenly included headers associated with the executor and planner bits in rewrite/rowsecurity.h. Per policy and general good sense, executor headers should not be included in planner headers or vice versa. The include of execnodes.h was a mistaken holdover from previous versions, while the include of relation.h was used for Relation's definition, which should have been coming from utils/relcache.h. This patch cleans these issues up, adds comments to the RowSecurityPolicy struct and the RowSecurityConfigType enum, and changes Relation->rsdesc to Relation->rd_rsdesc to follow Relation field naming convention. Additionally, utils/rel.h was including rewrite/rowsecurity.h, which wasn't a great idea since that was pulling in things not really needed in utils/rel.h (which gets included in quite a few places). Instead, use 'struct RowSecurityDesc' for the rd_rsdesc field and add comments explaining why. Lastly, add an include into access/nbtree/nbtsort.c for utils/sortsupport.h, which was evidently missed due to the above mess. Pointed out by Tom in 16970.1415838651@sss.pgh.pa.us; note that the concerns regarding a similar situation in the custom-path commit still need to be addressed.	2014-11-14 17:05:17 -05:00
Alvaro Herrera	86cf9a5650	Reduce disk footprint of brin regression test Per complaint from Tom. While at it, throw in some extra tests for nulls as well, and make sure that the set of data we insert on the second round is not identical to the first one. Both measures are intended to improve coverage of the test. Also uncomment the ON COMMIT DROP clause on the CREATE TEMP TABLE commands. This doesn't have any effect for someone examining the regression database after the tests are done, but it reduces clutter for those that execute the script directly.	2014-11-14 16:31:48 -03:00
Alvaro Herrera	51f9ea25dc	Allow interrupting GetMultiXactIdMembers This function has a loop which can lead to uninterruptible process "stalls" (actually infinite loops) when some bugs are triggered. Avoid that unpleasant situation by adding a check for interrupts in a place that shouldn't degrade performance in the normal case. Backpatch to 9.3. Older branches have an identical loop here, but the aforementioned bugs are only a problem starting in 9.3 so there doesn't seem to be any point in backpatching any further.	2014-11-14 15:14:01 -03:00
Andres Freund	0c5af0a537	Move BufferGetBlockNumber() out of heap_page_is_all_visible()'s inner loop. In some workloads BufferGetBlockNumber() shows up in profiles due to the sheer number of calls to it (and because it causes cache misses). The compiler can't move it out of the loop because it's a full extern function call...	2014-11-14 17:04:44 +01:00
Andres Freund	6c878edc1d	Add valgrind suppression for pg_atomic_init_u64. pg_atomic_init_u64 (indirectly) uses compare/exchange to guarantee atomic writes on platforms where compare/exchange is available, but 64bit writes aren't atomic (yes, those exist). That leads to a harmless read of the initial value of variable.	2014-11-14 16:59:33 +01:00
Peter Eisentraut	a15d387c22	Improve logical decoding log messages suggestions from Robert Haas	2014-11-13 20:44:34 -05:00
Andres Freund	473f162ce1	Adapt valgrind.supp to the XLogInsert() split. The CRC computation now happens in XLogInsertRecord(), not XLogInsert() itself anymore.	2014-11-14 00:59:40 +01:00
Tom Lane	be09ceb218	Fix pg_dumpall to restore its ability to dump from ancient servers. Fix breakage induced by commits `d8d3d2a4f3` and 463f2625a5fb183b6a8925ccde98bb3889f921d9: pg_dumpall has crashed when attempting to dump from pre-8.1 servers since then, due to faulty construction of the query used for dumping roles from older servers. The query was erroneous as of the earlier commit, but it wasn't exposed unless you tried to use --binary-upgrade, which you presumably wouldn't with a pre-8.1 server. However commit `463f2625a` made it fail always. In HEAD, also fix additional breakage induced in the same query by commit `491c029dbc`, which evidently wasn't tested against pre-8.1 servers either. The bug is only latent in 9.1 because `463f2625a` hadn't landed yet, but it seems best to back-patch all branches containing the faulty query. Gilles Darold	2014-11-13 18:19:26 -05:00
Andres Freund	89fd41b390	Fix and improve cache invalidation logic for logical decoding. There are basically three situations in which logical decoding needs to perform cache invalidation. During/After replaying a transaction with catalog changes, when skipping an uninteresting transaction that performed catalog changes and when erroring out while replaying a transaction. Unfortunately these three cases were all done slightly differently - partially because `8de3e410fa`, which greatly simplifies matters, got committed in the midst of the development of logical decoding. The actually problematic case was when logical decoding skipped transaction commits (and thus processed invalidations). When used via the SQL interface cache invalidation could access the catalog - bad, because we didn't set up enough state to allow that correctly. It'd not be hard to setup sufficient state, but the simpler solution is to always perform cache invalidation outside a valid transaction. Also make the different cache invalidation cases look as similar as possible, to ease code review. This fixes the assertion failure reported by Antonin Houska in 53EE02D9.7040702@gmail.com. The presented testcase has been expanded into a regression test. Backpatch to 9.4, where logical decoding was introduced.	2014-11-13 20:34:31 +01:00
Andres Freund	5a2c184058	Fix xmin/xmax horizon computation during logical decoding initialization. When building the initial historic catalog snapshot there were scenarios where snapbuild.c would use incorrect xmin/xmax values when starting from a xl_running_xacts record. The values used were always a bit suspect, but happened to be correct in the easy to test cases. Notably the values used when the initial snapshot was computed while no other transactions were running were correct. This is likely to be the cause of the occasional buildfarm failures on animals markhor and tick; but it's quite possible to reproduce problems without CLOBBER_CACHE_ALWAYS. Backpatch to 9.4, where logical decoding was introduced.	2014-11-13 20:34:30 +01:00
Heikki Linnakangas	81c4508196	Fix race condition between hot standby and restoring a full-page image. There was a window in RestoreBackupBlock where a page would be zeroed out, but not yet locked. If a backend pinned and locked the page in that window, it saw the zeroed page instead of the old page or new page contents, which could lead to missing rows in a result set, or errors. To fix, replace RBM_ZERO with RBM_ZERO_AND_LOCK, which atomically pins, zeroes, and locks the page, if it's not in the buffer cache already. In stable branches, the old RBM_ZERO constant is renamed to RBM_DO_NOT_USE, to avoid breaking any 3rd party extensions that might use RBM_ZERO. More importantly, this avoids renumbering the other enum values, which would cause even bigger confusion in extensions that use ReadBufferExtended, but haven't been recompiled. Backpatch to all supported versions; this has been racy since hot standby was introduced.	2014-11-13 20:02:37 +02:00
Robert Haas	c0828b78e9	Move the guts of our Levenshtein implementation into core. The hope is that we can use this to produce better diagnostics in some cases. Peter Geoghegan, reviewed by Michael Paquier, with some further changes by me.	2014-11-13 12:33:26 -05:00
Heikki Linnakangas	34402ae351	Fix XLogReadBufferForRedoExtended to get cleanup lock when asked to do so.	2014-11-13 17:54:20 +02:00
Fujii Masao	c291503b1c	Rename pending_list_cleanup_size to gin_pending_list_limit. Since this parameter is only for GIN index, it's better to add "gin" to the parameter name for easier understanding.	2014-11-13 12:14:48 +09:00
Tom Lane	677708032c	Explicitly support the case that a plancache's raw_parse_tree is NULL. This only happens if a client issues a Parse message with an empty query string, which is a bit odd; but since it is explicitly called out as legal by our FE/BE protocol spec, we'd probably better continue to allow it. Fix by adding tests everywhere that the raw_parse_tree field is passed to functions that don't or shouldn't accept NULL. Also make it clear in the relevant comments that NULL is an expected case. This reverts commits `a73c9dbab0` and `2e9650cbcf`, which fixed specific crash symptoms by hacking things at what now seems to be the wrong end, ie the callee functions. Making the callees allow NULL is superficially more robust, but it's not always true that there is a defensible thing for the callee to do in such cases. The caller has more context and is better able to decide what the empty-query case ought to do. Per followup discussion of bug #11335. Back-patch to 9.2. The code before that is sufficiently different that it would require development of a separate patch, which doesn't seem worthwhile for what is believed to be an essentially cosmetic change.	2014-11-12 15:59:01 -05:00
Andres Freund	ec5896aed3	Fix several weaknesses in slot and logical replication on-disk serialization. Heikki noticed in 544E23C0.8090605@vmware.com that slot.c and snapbuild.c were missing the FIN_CRC32 call when computing/checking checksums of on disk files. That doesn't lower the error detection capabilities of the checksum, but is inconsistent with other usages. In a followup mail Heikki also noticed that, contrary to a comment, the 'version' and 'length' struct fields of replication slot's on disk data were not covered by the checksum. That's not likely to lead to actually missed corruption as those fields are cross checked with the expected version and the actual file length. But it's wrong nonetheless. As fixing these issues makes existing on disk files unreadable, bump the expected versions of on disk files for both slots and logical decoding historic catalog snapshots. This means that loading old files will fail with ERROR: "replication slot file ... has unsupported version 1" and ERROR: "snapbuild state file ... has unsupported version 1 instead of 2" respectively. Given the low likelihood of anybody already using these new features in a production setup that seems acceptable. Fixing these issues made me notice that there's no regression test covering the loading of historic snapshot from disk - so add one. Backpatch to 9.4 where these features were introduced.	2014-11-12 18:52:49 +01:00
Noah Misch	28245b8424	Use just one database connection in the "tablespace" test. On Windows, DROP TABLESPACE has a race condition when run concurrently with other processes having opened files in the tablespace. This led to a rare failure on buildfarm member frogmouth. Back-patch to 9.4, where the reconnection was introduced.	2014-11-12 07:33:17 -05:00
Peter Eisentraut	8339f33d68	Message improvements	2014-11-11 20:02:30 -05:00
Robert Haas	f1abd78be7	Remove incorrect comment. This was introduced by commit `5ea86e6e65`. Peter Geoghegan	2014-11-11 18:41:29 -05:00
Tom Lane	2edfc021c6	Fix dependency searching for case where column is visited before table. When the recursive search in dependency.c visits a column and then later visits the whole table containing the column, it needs to propagate the drop-context flags for the table to the existing target-object entry for the column. Otherwise we might refuse the DROP (if not CASCADE) on the incorrect grounds that there was no automatic drop pathway to the column. Remarkably, this has not been reported before, though it's possible at least when an extension creates both a datatype and a table using that datatype. Rather than just marking the column as allowed to be dropped, it might seem good to skip the DROP COLUMN step altogether, since the later DROP of the table will surely get the job done. The problem with that is that the datatype would then be dropped before the table (since the whole situation occurred because we visited the datatype, and then recursed to the dependent column, before visiting the table). That seems pretty risky, and the case is rare enough that it doesn't seem worth expending a lot of effort or risk to make the drops happen in a safe order. So we just play dumb and delete the column separately according to the existing drop ordering rules. Per report from Petr Jelinek, though this is different from his proposed patch. Back-patch to 9.1, where extensions were introduced. There's currently no evidence that such cases can arise before 9.1, and in any case we would also need to back-patch `cb5c2ba2d8` to 9.0 if we wanted to back-patch this.	2014-11-11 17:00:11 -05:00
Fujii Masao	1871c89202	Add generate_series(numeric, numeric). Платон Малюгин Reviewed by Michael Paquier, Ali Akbar and Marti Raudsepp	2014-11-11 21:44:46 +09:00
Fujii Masao	a1b395b6a2	Add GUC and storage parameter to set the maximum size of GIN pending list. Previously the maximum size of GIN pending list was controlled only by work_mem. But the reasonable value of work_mem and the reasonable size of the list are basically not the same, so it was not appropriate to control both of them by only one GUC, i.e., work_mem. This commit separates new GUC, pending_list_cleanup_size, from work_mem to allow users to control only the size of the list. Also this commit adds pending_list_cleanup_size as new storage parameter to allow users to specify the size of the list per index. This is useful, for example, when users want to increase the size of the list only for the GIN index which can be updated heavily, and decrease it otherwise. Reviewed by Etsuro Fujita.	2014-11-11 21:08:21 +09:00
Heikki Linnakangas	ae667f778d	Really fix compilation failure on MIPS. I missed an additional colon in previous patch. Oops. To make that mistake less likely in the future, add comments as placeholders for unused inputs and outputs in inline assembly.	2014-11-11 10:25:22 +02:00
Heikki Linnakangas	baf7b3a503	Fix compilation failure on MIPS. Rémi Zara	2014-11-11 01:06:06 +02:00

26748 Commits