Commit Graph

39484 Commits

Author SHA1 Message Date
Tom Lane 3c93a60f60 Add some more defenses against silly estimates to gincostestimate().
A report from Andy Colson showed that gincostestimate() was not being
nearly paranoid enough about whether to believe the statistics it finds in
the index metapage.  The problem is that the metapage stats (other than the
pending-pages count) are only updated by VACUUM, and in the worst case
could still reflect the index's original empty state even when it has grown
to many entries.  We attempted to deal with that by scaling up the stats to
match the current index size, but if nEntries is zero then scaling it up
still gives zero.  Moreover, the proportion of pages that are entry pages
vs. data pages vs. pending pages is unlikely to be estimated very well by
scaling if the index is now orders of magnitude larger than before.

We can improve matters by expanding the use of the rule-of-thumb estimates
I introduced in commit 7fb008c5ee59b040: if the index has grown by more
than a cutoff amount (here set at 4X growth) since VACUUM, then use the
rule-of-thumb numbers instead of scaling.  This might not be exactly right
but it seems much less likely to produce insane estimates.

I also improved both the scaling estimate and the rule-of-thumb estimate
to account for numPendingPages, since it's reasonable to expect that that
is accurate in any case, and certainly pages that are in the pending list
are not either entry or data pages.

As a somewhat separate issue, adjust the estimation equations that are
concerned with extra fetches for partial-match searches.  These equations
suppose that a fraction partialEntries / numEntries of the entry and data
pages will be visited as a consequence of a partial-match search.  Now,
it's physically impossible for that fraction to exceed one, but our
estimate of partialEntries is mostly bunk, and our estimate of numEntries
isn't exactly gospel either, so we could arrive at a silly value.  In the
example presented by Andy we were coming out with a value of 100, leading
to insane cost estimates.  Clamp the fraction to one to avoid that.

Like the previous patch, back-patch to all supported branches; this
problem can be demonstrated in one form or another in all of them.
2016-01-01 13:42:21 -05:00
Noah Misch 3cd1ba147e Fix comments about WAL rule "write xlog before data" versus pg_multixact.
Recovery does not achieve its goal of zeroing all pg_multixact entries
whose accompanying WAL records never reached disk.  Remove that claim
and justify its expendability.  Detail the need for TrimMultiXact(),
which has little in common with the TrimCLOG() rationale.  Merge two
tightly-related comments.  Stop presenting pg_multixact as specific to
heap_lock_tuple(); PostgreSQL 9.3 extended its use to heap_update().

Noticed while investigating a report from Andres Freund.
2016-01-01 01:46:46 -05:00
Peter Eisentraut 253de19b84 doc: Remove redundant duplicate URLs from ulink elements
Empty ulink elements default to displaying the URL, so there is no need
to specify the URL again.  This was already done for most occurrences,
but some cases didn't follow this convention.
2015-12-31 22:26:57 -05:00
Peter Eisentraut 805ac78aab doc: Add index entries and better documentation link for Linux OOM 2015-12-31 22:03:13 -05:00
Tom Lane 5f36096b77 Add a comment noting that FDWs don't have to implement EXCEPT or LIMIT TO.
postgresImportForeignSchema pays attention to IMPORT's EXCEPT and LIMIT TO
options, but only as an efficiency hack, not for correctness' sake.  The
FDW documentation does explain that, but someone using postgres_fdw.c
as a coding guide might not remember it, so let's add a comment here.
Per question from Regina Obe.
2015-12-31 17:59:10 -05:00
Tom Lane 0dab5ef39b Fix ALTER OPERATOR to update dependencies properly.
Fix an oversight in commit 321eed5f0f7563a0: replacing an operator's
selectivity functions needs to result in a corresponding update in
pg_depend.  We have a function that can handle that, but it was not
called by AlterOperator().

To fix this without enlarging pg_operator.h's #include list beyond
what clients can safely include, split off the function definitions
into a new file pg_operator_fn.h, similarly to what we've done for
some other catalog header files.  It's not entirely clear whether
any client-side code needs to include pg_operator.h, but it seems
prudent to assume that there is some such code somewhere.
2015-12-31 17:37:31 -05:00
Tom Lane e5d06f2b12 Dept of second thoughts: the !scan_all exit mustn't increase scanned_pages.
In the extreme edge case where contended pages are the only ones that
escape being scanned, the previous commit would have allowed us to think
that relfrozenxid could be advanced, which is exactly wrong.
2015-12-30 17:32:23 -05:00
Tom Lane e842908233 Avoid useless truncation attempts during VACUUM.
VACUUM can skip heap pages altogether when there's a run of consecutive
pages that are all-visible according to the visibility map.  This causes it
to not update its nonempty_pages count, just as if those pages were empty,
which means that at the end we will think they are candidates for deletion.
Thus, we may take the table's AccessExclusive lock only to find that no
pages are really truncatable.  This usually causes no real problems on a
master server, thanks to the lock being acquired only conditionally; but on
hot-standby servers, the same lock must be acquired unconditionally which
can result in unnecessary query cancellations.

To improve matters, force examination of the table's last page whenever
we reach there with a nonempty_pages count that would allow a truncation
attempt.  If it's not empty, we'll advance nonempty_pages and thereby
prevent the truncation attempt.

If we are unable to acquire cleanup lock on that page, there's no need to
force it, unless we're doing an anti-wraparound vacuum.  We can just check
for tuples with a shared buffer lock and then give up.  (When we are doing
an anti-wraparound vacuum, and decide it's okay to skip the page because it
contains no freezable tuples, this patch still improves matters because
nonempty_pages is properly updated, which it was not before.)

Since only the last page is special-cased in this way, we might attempt a
truncation that will release many fewer pages than the normal heuristic
would suggest; at worst, only one page would be truncated.  But that seems
all right, because the situation won't repeat during the next vacuum.
The real problem with the old logic is that the useless truncation attempt
happens every time we vacuum, so long as the state of the last few dozen
pages doesn't change.

This is a longstanding deficiency, but since the consequences aren't very
severe in most scenarios, I'm not going to risk a back-patch.

Jeff Janes and Tom Lane
2015-12-30 17:13:15 -05:00
Tom Lane e5e5267a91 Minor hacking on contrib/cube documentation.
Improve markup, particularly of the table of functions; add or improve
examples for some of the functions; wordsmith some of the function
descriptions.
2015-12-29 21:21:04 -05:00
Tom Lane efe4c9d704 Add some comments about division of labor between rewriter and planner.
The rationale for the way targetlist processing is done wasn't clearly
stated anywhere, and I for one had forgotten some of the details.  Having
just painfully re-learned them, add some breadcrumbs for the next person.
2015-12-29 18:50:35 -05:00
Tom Lane fd19525756 Put back one copyObject() in rewriteTargetView().
Commit 6f8cb1e234 tried to centralize rewriteTargetView's copying
of a target view's Query struct.  However, it ignored the fact that the
jointree->quals field was used twice.  This only accidentally failed to
fail immediately because the same ChangeVarNodes mutation is applied in
both cases, so that we end up with logically identical expression trees
for both uses (and, as the code stands, the second ChangeVarNodes call
actually does nothing).  However, we end up linking *physically*
identical expression trees into both an RTE's securityQuals list and
the WithCheckOption list.  That's pretty dangerous, mainly because
prepsecurity.c is utterly cavalier about further munging such structures
without copying them first.

There may be no live bug in HEAD as a consequence of the fact that we apply
preprocess_expression in between here and prepsecurity.c, and that will
make a copy of the tree anyway.  Or it may just be that the regression
tests happen to not trip over it.  (I noticed this only because things
fell over pretty badly when I tried to relocate the planner's call of
expand_security_quals to before expression preprocessing.)  In any case
it's very fragile because if anyone tried to make the securityQuals and
WithCheckOption trees diverge before we reach preprocess_expression, it
would not work.  The fact that the current code will preprocess
securityQuals and WithCheckOptions lists at completely different times in
different query levels does nothing to increase my trust that that can't
happen.

In view of the fact that 9.5.0 is almost upon us and the aforesaid commit
has seen exactly zero field testing, the prudent course is to make an extra
copy of the quals so that the behavior is not different from what has been
in the field during beta.
2015-12-29 16:45:47 -05:00
Joe Conway 241448b23a Rename (new|old)estCommitTs to (new|old)estCommitTsXid
The variables newestCommitTs and oldestCommitTs sound as if they are
timestamps, but in fact they are the transaction Ids that correspond
to the newest and oldest timestamps rather than the actual timestamps.
Rename these variables to reflect that they are actually xids: to wit
newestCommitTsXid and oldestCommitTsXid respectively. Also modify
related code in a similar fashion, particularly the user facing output
emitted by pg_controldata and pg_resetxlog.

Complaint and patch by me, review by Tom Lane and Alvaro Herrera.
Backpatch to 9.5 where these variables were first introduced.
2015-12-28 12:34:11 -08:00
Tom Lane 81ee726d87 Code and docs review for cube kNN support.
Commit 33bd250f6c could have done with
some more review:

Adjust coding so that compilers unfamiliar with elog/ereport don't complain
about uninitialized values.

Fix misuse of PG_GETARG_INT16 to retrieve arguments declared as "integer"
at the SQL level.  (This was evidently copied from cube_ll_coord and
cube_ur_coord, but those were wrong too.)

Fix non-style-guide-conforming error messages.

Fix underparenthesized if statements, which pgindent would have made a
hash of, and remove some unnecessary parens elsewhere.

Run pgindent over new code.

Revise documentation: repeated accretion of more operators without any
rethinking of the text already there had left things in a bit of a mess.
Merge all the cube operators into one table and adjust surrounding text
appropriately.

David Rowley and Tom Lane
2015-12-28 14:39:12 -05:00
Alvaro Herrera ac443d1034 Document brin_summarize_new_pages
Pointer out by Jeff Janes
2015-12-28 15:28:19 -03:00
Tom Lane 54aaafe95f Document the exponentiation operator as associating left to right.
Common mathematical convention is that exponentiation associates right to
left.  We aren't going to change the parser for this, but we could note
it in the operator's description.  (It's already noted in the operator
precedence/associativity table, but users might not look there.)
Per bug #13829 from Henrik Pauli.
2015-12-28 12:09:00 -05:00
Tom Lane 870df2b3b7 Fix omission of -X (--no-psqlrc) in some psql invocations.
As of commit d5563d7df, psql -c no longer implies -X, but not all of
our regression testing scripts had gotten that memo.

To ensure consistency of results across different developers, make
sure that *all* invocations of psql in all scripts in our tree
use -X, even where this is not what previously happened.

Michael Paquier and Tom Lane
2015-12-28 11:46:43 -05:00
Alvaro Herrera 151c4ffe41 doc: pg_committs -> pg_commit_ts
Reported by: Alain Laporte (#13836)
2015-12-28 13:45:03 -03:00
Tom Lane 731dfc7d5f Update documentation about pseudo-types.
Tone down an overly strong statement about which pseudo-types PLs are
likely to allow.  Add "event_trigger" to the list, as well as
"pg_ddl_command" in 9.5/HEAD.  Back-patch to 9.3 where event_trigger
was added.
2015-12-28 11:04:42 -05:00
Alvaro Herrera fc995bfdbf Fix translation domain in pg_basebackup
For some reason, we've been overlooking the fact that pg_receivexlog
and pg_recvlogical are using wrong translation domains all along,
so their output hasn't ever been translated.  The right domain is
pg_basebackup, not their own executable names.

Noticed by Ioseph Kim, who's been working on the Korean translation.

Backpatch pg_receivexlog to 9.2 and pg_recvlogical to 9.4.
2015-12-28 10:50:35 -03:00
Alvaro Herrera 743229a67e Add forgotten CHECK_FOR_INTERRUPT calls in pgcrypto's crypt()
Both Blowfish and DES implementations of crypt() can take arbitrarily
long time, depending on the number of rounds specified by the caller;
make sure they can be interrupted.

Author: Andreas Karlsson
Reviewer: Jeff Janes

Backpatch to 9.1.
2015-12-27 13:03:19 -03:00
Tom Lane fec1ad94df Include typmod when complaining about inherited column type mismatches.
MergeAttributes() rejects cases where columns to be merged have the same
type but different typmod, which is correct; but the error message it
printed didn't show either typmod, which is unhelpful.  Changing this
requires using format_type_with_typemod() in place of TypeNameToString(),
which will have some minor side effects on the way some type names are
printed, but on balance this is an improvement: the old code sometimes
printed one type according to one set of rules and the other type according
to the other set, which could be confusing in its own way.

Oddly, there were no regression test cases covering any of this behavior,
so add some.

Complaint and fix by Amit Langote
2015-12-26 13:41:29 -05:00
Tom Lane 3d2b31e30e Fix brin_summarize_new_values() to check index type and ownership.
brin_summarize_new_values() did not check that the passed OID was for
an index at all, much less that it was a BRIN index, and would fail in
obscure ways if it wasn't (possibly damaging data first?).  It also
lacked any permissions test; by analogy to VACUUM, we should only allow
the table's owner to summarize.

Noted by Jeff Janes, fix by Michael Paquier and me
2015-12-26 12:56:09 -05:00
Fujii Masao 8014c44e82 Improve SECURITY LABEL tab completion
Add DATABASE, EVENT TRIGGER, FOREIGN TABLE, ROLE, and TABLESPACE to
tab completion for SECURITY LABEL.

Kyotaro Horiguchi
2015-12-25 22:56:01 +09:00
Teodor Sigaev 25bfa7efd0 Improve the gin index scan performance in pg_trgm.
Previous coding assumes too pessimistic upper bound of similarity
in consistent method of GIN.

Author: Fornaroli Christophe with comments by me.
2015-12-25 13:05:13 +03:00
Tom Lane a9246fbf66 Remove unnecessary row ordering dependency in pg_rewind test suite.
t/002_databases.pl was expecting to see a specific physical order of the
rows in pg_database.  I broke that in HEAD with commit 01e386a325,
but I'd say it's a pretty fragile test methodology in any case, so fix
it in 9.5 as well.
2015-12-24 11:38:31 -05:00
Tom Lane 71dd092c01 Docs: fix erroneously-given function name.
pg_replication_session_is_setup() exists nowhere; apparently this is
meant to refer to pg_replication_origin_session_is_setup().

Adrien Nayrat
2015-12-24 10:50:03 -05:00
Tom Lane 96cd61a169 Fix factual and grammatical errors in comments for struct _tableInfo.
Amit Langote, further adjusted by me
2015-12-24 10:42:58 -05:00
Tom Lane bee172fcd5 Docs typo fix.
Michael Paquier
2015-12-24 10:23:44 -05:00
Tom Lane 01e386a325 Avoid VACUUM FULL altogether in initdb.
Commit ed7b3b3811 purported to remove initdb's use of VACUUM FULL,
as had been agreed to in a pghackers discussion back in Dec 2014.
But it missed this one ...
2015-12-23 20:09:01 -05:00
Tom Lane ff402ae11b Improve handling of password reuse in src/bin/scripts programs.
This reverts most of commit 83dec5a71 in favor of having connectDatabase()
store the possibly-reusable password in a static variable, similar to the
coding we've had for a long time in pg_dump's version of that function.
To avoid possible problems with unwanted password reuse, make callers
specify whether it's reasonable to attempt to re-use the password.
This is a wash for cases where re-use isn't needed, but it is far simpler
for callers that do want that.  Functionally there should be no difference.

Even though we're past RC1, it seems like a good idea to back-patch this
into 9.5, like the prior commit.  Otherwise, if there are any third-party
users of connectDatabase(), they'll have to deal with an API change in
9.5 and then another one in 9.6.

Michael Paquier
2015-12-23 15:45:43 -05:00
Tom Lane 1aa41e3eae In pg_dump, remember connection passwords no matter how we got them.
When pg_dump prompts the user for a password, it remembers the password
for possible re-use by parallel worker processes.  However, libpq might
have extracted the password from a connection string originally passed
as "dbname".  Since we don't record the original form of dbname but
break it down to host/port/etc, the password gets lost.  Fix that by
retrieving the actual password from the PGconn.

(It strikes me that this whole approach is rather broken, as it will also
lose other information such as options that might have been present in
the connection string.  But we'll leave that problem for another day.)

In passing, get rid of rather silly use of malloc() for small fixed-size
arrays.

Back-patch to 9.3 where parallel pg_dump was introduced.

Report and fix by Zeus Kronion, adjusted a bit by Michael Paquier and me
2015-12-23 14:25:53 -05:00
Robert Haas bc7fcab5e3 Read from the same worker repeatedly until it returns no tuple.
The original coding read tuples from workers in round-robin fashion,
but performance testing shows that it works much better to read enough
to empty one queue before moving on to the next.  I believe the
reason for this is that, with the old approach, we could easily wake
up a worker repeatedly to write only one new tuple into the shm_mq
each time.  With this approach, by the time the process gets scheduled,
it has a decent chance of being able to fill the entire buffer in
one go.

Patch by me.  Dilip Kumar helped with performance testing.
2015-12-23 14:06:52 -05:00
Robert Haas 51d152f18e Change Gather not to use a physical tlist.
This should have been part of the original commit, but was missed.
Pushing data between processes is expensive, so we definitely want
to project away unneeded columns here, just as we do for other nodes
like Sort and Hash that care about the volume of data.
2015-12-23 13:41:06 -05:00
Peter Eisentraut 30c0c4bf12 Remove unnecessary escaping in C character literals
'\"' is more commonly written simply as '"'.
2015-12-22 22:43:46 -05:00
Tom Lane 6efbded6e4 Allow omitting one or both boundaries in an array slice specifier.
Omitted boundaries represent the upper or lower limit of the corresponding
array subscript.  This allows simpler specification of many common
use-cases.

(Revised version of commit 9246af6799)

YUriy Zhuravlev
2015-12-22 21:05:29 -05:00
Robert Haas 0ba3f3bc65 Comment improvements for abbreviated keys.
Peter Geoghegan and Robert Haas
2015-12-22 13:57:18 -05:00
Robert Haas ccd8f97922 postgres_fdw: Consider requesting sorted data so we can do a merge join.
When use_remote_estimate is enabled, consider adding ORDER BY to the
query we sending to the remote server so that we can use that ordered
data for a merge join.  Commit f18c944b61
arranges to push down the query pathkeys, which seems like the case
mostly likely to be a win, but testing shows this can sometimes win,
too.

For a regular table, we know which indexes are present and therefore
test whether the ordering provided by each such index is useful.  Here,
we take the opposite approach: guess what orderings would be useful if
they could be generated cheaply, and then ask the remote side what those
will cost.

Ashutosh Bapat, with very substantial cosmetic revisions by me.  Also
reviewed by Rushabh Lathia.
2015-12-22 13:46:40 -05:00
Tom Lane f5a4370aea Fix calculation of space needed for parsed words in tab completion.
Yesterday in commit d854118c8, I had a serious brain fade leading me to
underestimate the number of words that the tab-completion logic could
divide a line into.  On input such as "(((((", each character will get
seen as a separate word, which means we do indeed sometimes need more
space for the words than for the original line.  Fix that.
2015-12-21 15:08:56 -05:00
Stephen Frost 6f8cb1e234 Make viewquery a copy in rewriteTargetView()
Rather than expect the Query returned by get_view_query() to be
read-only and then copy bits and pieces of it out, simply copy the
entire structure when we get it.  This addresses an issue where
AcquireRewriteLocks, which is called by acquireLocksOnSubLinks(),
scribbles on the parsetree passed in, which was actually an entry
in relcache, leading to segfaults with certain view definitions.
This also future-proofs us a bit for anyone adding more code to this
path.

The acquireLocksOnSubLinks() was added in commit c3e0ddd40.

Back-patch to 9.3 as that commit was.
2015-12-21 10:34:14 -05:00
Tom Lane 99ccb23092 Remove silly completion for "DELETE FROM tabname ...".
psql offered USING, WHERE, and SET in this context, but SET is not a valid
possibility here.  Seems to have been a thinko in commit f5ab0a14ea
which added DELETE's USING option.
2015-12-20 18:29:51 -05:00
Tom Lane d854118c8d Teach psql's tab completion to consider the entire input string.
Up to now, the tab completion logic has only examined the last few words
of the current input line; "last few" being originally as few as four
words, but lately up to nine words.  Furthermore, it only looked at what
libreadline considers the current line of input, which made it rather
myopic if you split your command across lines.  This was tolerable,
sort of, so long as the match patterns were only designed to consider the
last few words of input; but with the recent addition of HeadMatches()
and Matches() matching rules, we really have to do better if we want
those to behave sanely.

Hence, change the code to break the entire line down into words, and to
include any previous lines in the command buffer along with the active
readline input buffer.

This will be a little bit slower than the previous coding, but some
measurements say that even a query of several thousand characters can be
parsed in a hundred or so microseconds on modern machines; so it's really
not going to be significant for interactive tab completion.  To reduce
the cost some, I arranged to avoid the per-word malloc calls that used
to occur: all the words are now kept in one malloc'd buffer.
2015-12-20 13:28:18 -05:00
Peter Eisentraut 69e7c44fc6 psql: Review of new help output strings 2015-12-20 11:50:04 -05:00
Tom Lane 654218138b Add missing COSTS OFF to EXPLAIN commands in rowsecurity.sql.
Commit e5e11c8cc added a bunch of EXPLAIN statements without COSTS OFF
to the regression tests.  This is contrary to project policy since it
results in unnecessary platform dependencies in the output (it's just
luck that we didn't get buildfarm failures from it).  Per gripe from
Mike Wilson.
2015-12-19 16:55:14 -05:00
Tom Lane d37b816dc9 Adopt a more compact, less error-prone notation for tab completion code.
Replace tests like

    else if (pg_strcasecmp(prev4_wd, "CREATE") == 0 &&
             pg_strcasecmp(prev3_wd, "TRIGGER") == 0 &&
             (pg_strcasecmp(prev_wd, "BEFORE") == 0 ||
              pg_strcasecmp(prev_wd, "AFTER") == 0))

with new notation like this:

    else if (TailMatches4("CREATE", "TRIGGER", MatchAny, "BEFORE|AFTER"))

In addition, provide some macros COMPLETE_WITH_LISTn() to reduce the amount
of clutter needed to specify a small number of predetermined completion
alternatives.

This makes the code substantially more compact: tab-complete.c gets over a
thousand lines shorter in this patch, despite the addition of a couple of
hundred lines of infrastructure for the new notations.  The new way of
specifying match rules seems a whole lot more readable and less
error-prone, too.

There's a lot more that could be done now to make matching faster and more
reliable; for example I suspect that most of the TailMatches() rules should
now be Matches() rules.  That would allow them to be skipped after a single
integer comparison if there aren't the right number of words on the line,
and it would reduce the risk of unintended matches.  But for now, (mostly)
refrain from reworking any match rules in favor of just converting what
we've got into the new notation.

Thomas Munro, reviewed by Michael Paquier, some adjustments by me
2015-12-19 16:03:14 -05:00
Peter Eisentraut 529fd74c09 Fix whitespace 2015-12-19 11:46:38 -05:00
Andres Freund 130d94a7b8 Fix tab completion for ALTER ... TABLESPACE ... OWNED BY.
Previously the completion used the wrong word to match 'BY'. This was
introduced brokenly, in b2de2a. While at it, also add completion of
IN TABLESPACE ... OWNED BY and fix comments referencing nonexistent
syntax.

Reported-By: Michael Paquier
Author: Michael Paquier and Andres Freund
Discussion: CAB7nPqSHDdSwsJqX0d2XzjqOHr==HdWiubCi4L=Zs7YFTUne8w@mail.gmail.com
Backpatch: 9.4, like the commit introducing the bug
2015-12-19 17:37:11 +01:00
Teodor Sigaev bbbd807097 Revert 9246af6799 because
I miss too much. Patch is returned to commitfest process.
2015-12-18 21:35:22 +03:00
Robert Haas 3c7042a7d7 pgbench: Change terminology from "threshold" to "parameter".
Per a recommendation from Tomas Vondra, it's more helpful to refer to
the value that determines how skewed a Gaussian or exponential
distribution is as a parameter rather than a threshold.

Since it's not quite too late to get this right in 9.5, where it was
introduced, back-patch this.  Most of the patch changes only comments
and documentation, but a few pgbench messages are altered to match.

Fabien Coelho, reviewed by Michael Paquier and by me.
2015-12-18 13:24:51 -05:00
Robert Haas 6e7b335930 Remove duplicate word.
Kyotaro Horiguchi
2015-12-18 12:43:52 -05:00
Robert Haas 2bdfcb52c5 Fix TupleQueueReaderNext not to ignore its nowait argument.
This was a silly goof on my (rhaas's) part.

Report and fix by Rushabh Lathia.
2015-12-18 12:37:43 -05:00