Commit Graph

17622 Commits

Author SHA1 Message Date
Tom Lane 08f1e1f0a4 Make setrefs.c match by ressortgroupref even for plain Vars.
Previously, we skipped using search_indexed_tlist_for_sortgroupref()
if the tlist expression being sought in the child plan node was merely
a Var.  This is purely an optimization, based on the theory that
search_indexed_tlist_for_var() is faster, and one copy of a Var should
be as good as another.  However, the GROUPING SETS patch broke the
latter assumption: grouping columns containing the "same" Var can
sometimes have different outputs, as shown in the test case added here.
So do it the hard way whenever a ressortgroupref marking exists.

(If this seems like a bottleneck, we could imagine building a tlist index
data structure for ressortgroupref values, as we do for Vars.  But I'll
let that idea go until there's some evidence it's worthwhile.)

Back-patch to 9.6.  The problem also exists in 9.5 where GROUPING SETS
came in, but this patch is insufficient to resolve the problem in 9.5:
there is some obscure dependency on the upper-planner-pathification
work that happened in 9.6.  Given that this is such a weird corner case,
and no end users have complained about it, it doesn't seem worth the work
to develop a fix for 9.5.

Patch by me, per a report from Heikki Linnakangas.  (This does not fix
Heikki's original complaint, just the follow-on one.)

Discussion: https://postgr.es/m/aefc657e-edb2-64d5-6df1-a0828f6e9104@iki.fi
2017-10-26 12:17:40 -04:00
Andrew Dunstan adee9e4e31 Undo inadvertent change in capitalization in commit 18fc4ec. 2017-10-26 08:20:00 -04:00
Robert Haas b55509332f In relevant log messages, indicate whether vacuums are aggressive.
Kyotaro Horiguchi, reviewed Masahiko Sawada, David G. Johnston, Álvaro
Herrera, and me.  Grammar correction to the final posted patch by me.

Discussion: http://postgr.es/m/20170329.124649.193656100.horiguchi.kyotaro@lab.ntt.co.jp
2017-10-26 12:38:10 +02:00
Andrew Dunstan 18fc4ecf4a Process variadic arguments consistently in json functions
json_build_object and json_build_array and the jsonb equivalents did not
correctly process explicit VARIADIC arguments. They are modified to use
the new extract_variadic_args() utility function which abstracts away
the details of the call method.

Michael Paquier, reviewed by Tom Lane and Dmitry Dolgov.

Backpatch to 9.5 for the jsonb fixes and 9.4 for the json fixes, as
that's where they originated.
2017-10-25 07:34:00 -04:00
Andrew Dunstan f3c6e8a27a Add a utility function to extract variadic function arguments
This is epecially useful in the case or "VARIADIC ANY" functions. The
caller can get the artguments and types regardless of whether or not and
explicit VARIADIC array argument has been used. The function also
provides an option to convert arguments on type "unknown" to to "text".

Michael Paquier and me, reviewed by Tom Lane.

Backpatch to 9.4 in order to support the following json bug fix.
2017-10-25 07:13:11 -04:00
Tom Lane 896eb5efbd In the planner, delete joinaliasvars lists after we're done with them.
Although joinaliasvars lists coming out of the parser are quite simple,
those lists can contain arbitrarily complex expressions after subquery
pullup.  We do not perform expression preprocessing on them, meaning that
expressions in those lists will not meet the expectations of later phases
of the planner (for example, that they do not contain SubLinks).  This had
been thought pretty harmless, since we don't intentionally touch those
lists in later phases --- but Andreas Seltenreich found a case in which
adjust_appendrel_attrs() could recurse into a joinaliasvars list and then
die on its assertion that it never sees a SubLink.  We considered a couple
of localized fixes to prevent that specific case from looking at the
joinaliasvars lists, but really this seems like a generic hazard for all
expression processing in the planner.  Therefore, probably the best answer
is to delete the joinaliasvars lists from the parsetree at the end of
expression preprocessing, so that there are no reachable expressions that
haven't been through preprocessing.

The case Andreas found seems to be harmless in non-Assert builds, and so
far there are no field reports suggesting that there are user-visible
effects in other cases.  I considered back-patching this anyway, but
it turns out that Andreas' test doesn't fail at all in 9.4-9.6, because
in those versions adjust_appendrel_attrs contains code (added in commit
842faa714 and removed again in commit 215b43cdc) to process SubLinks
rather than complain about them.  Barring discovery of another path by
which unprocessed joinaliasvars lists can cause trouble, the most
prudent compromise seems to be to patch this into v10 but not further.

Patch by me, with thanks to Amit Langote for initial investigation
and review.

Discussion: https://postgr.es/m/87r2tvt9f1.fsf@ansel.ydns.eu
2017-10-24 18:42:47 -04:00
Tom Lane f3ea3e3e82 Fix some oversights in expression dependency recording.
find_expr_references() neglected to record a dependency on the result type
of a FieldSelect node, allowing a DROP TYPE to break a view or rule that
contains such an expression.  I think we'd omitted this case intentionally,
reasoning that there would always be a related dependency ensuring that the
DROP would cascade to the view.  But at least with nested field selection
expressions, that's not true, as shown in bug #14867 from Mansur Galiev.
Add the dependency, and for good measure a dependency on the node's exposed
collation.

Likewise add a dependency on the result type of a FieldStore.  I think here
the reasoning was that it'd only appear within an assignment to a field,
and the dependency on the field's column would be enough ... but having
seen this example, I think that's wrong for nested-composites cases.

Looking at nearby code, I notice we're not recording a dependency on the
exposed collation of CoerceViaIO, which seems inconsistent with our choices
for related node types.  Maybe that's OK but I'm feeling suspicious of this
code today, so let's add that; it certainly can't hurt.

This patch does not do anything to protect already-existing views, only
views created after it's installed.  But seeing that the issue has been
there a very long time and nobody noticed till now, that's probably good
enough.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/20171023150118.1477.19174@wrigleys.postgresql.org
2017-10-23 13:57:45 -04:00
Tom Lane 36ea99c84d Fix typcache's failure to treat ranges as container types.
Like the similar logic for arrays and records, it's necessary to examine
the range's subtype to decide whether the range type can support hashing.
We can omit checking the subtype for btree-defined operations, though,
since range subtypes are required to have those operations.  (Possibly
that simplification for btree cases led us to overlook that it does
not apply for hash cases.)

This is only an issue if the subtype lacks hash support, which is not
true of any built-in range type, but it's easy to demonstrate a problem
with a range type over, eg, money: you can get a "could not identify
a hash function" failure when the planner is misled into thinking that
hash join or aggregation would work.

This was born broken, so back-patch to all supported branches.
2017-10-20 17:12:27 -04:00
Tom Lane a8f1efc8ac Fix misimplementation of typcache logic for extended hashing.
The previous coding would report that an array type supports extended
hashing if its element type supports regular hashing.  This bug is
only latent at the moment, since AFAICS there is not yet any code
that depends on checking presence of extended-hashing support to make
any decisions.  (And in any case it wouldn't matter unless the element
type has only regular hashing, which isn't true of any core data type.)
But that doesn't make it less broken.  Extend the
cache_array_element_properties infrastructure to check this properly.
2017-10-20 16:08:17 -04:00
Magnus Hagander 752871b6de Fix typos
David Rowley
2017-10-19 13:58:30 +02:00
Magnus Hagander 275c4be19d Fix typo
Masahiko Sawada
2017-10-19 13:57:20 +02:00
Peter Eisentraut 927e1ee2cb UCS_to_most.pl: Process encodings in sorted order
Otherwise the order depends on the Perl hash implementation, making it
cumbersome to scan the output when debugging.
2017-10-19 05:58:39 -04:00
Tom Lane 7421f4b89a Fix incorrect handling of CTEs and ENRs as DML target relations.
setTargetTable threw an error if the proposed target RangeVar's relname
matched any visible CTE or ENR.  This breaks backwards compatibility in
the CTE case, since pre-v10 we never looked for a CTE here at all, so that
CTE names did not mask regular tables.  It does seem like a good idea to
throw an error for the ENR case, though, thus causing ENRs to mask tables
for this purpose; ENRs are new in v10 so we're not breaking existing code,
and we may someday want to allow them to be the targets of DML.

To fix that, replace use of getRTEForSpecialRelationTypes, which was
overkill anyway, with use of scanNameSpaceForENR.

A second problem was that the check neglected to verify null schemaname,
so that a CTE or ENR could incorrectly be thought to match a qualified
RangeVar.  That happened because getRTEForSpecialRelationTypes relied
on its caller to have checked for null schemaname.  Even though the one
remaining caller got it right, this is obviously bug-prone, so move
the check inside getRTEForSpecialRelationTypes.

Also, revert commit 18ce3a4ab's extremely poorly thought out decision to
add a NULL return case to parserOpenTable --- without either documenting
that or adjusting any of the callers to check for it.  The current bug
seems to have arisen in part due to working around that bad idea.

In passing, remove the one-line shim functions transformCTEReference and
transformENRReference --- they don't seem to be adding any clarity or
functionality.

Per report from Hugo Mercier (via Julien Rouhaud).  Back-patch to v10
where the bug was introduced.

Thomas Munro, with minor editing by me

Discussion: https://postgr.es/m/CAOBaU_YdPVH+PTtiKSSLOiiW3mVDYsnNUekK+XPbHXiP=wrFLA@mail.gmail.com
2017-10-16 17:56:54 -04:00
Peter Eisentraut 4211673622 Exclude flex-generated code from coverage testing
Flex generates a lot of functions that are not actually used.  In order
to avoid coverage figures being ruined by that, mark up the part of the
.l files where the generated code appears by lcov exclusion markers.
That way, lcov will typically only reported on coverage for the .l file,
which is under our control, but not for the .c file.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-10-16 16:28:11 -04:00
Tom Lane cf5ba7c30c Treat aggregate direct arguments as per-agg data not per-trans data.
There is no reason to insist that direct arguments must match before
we can merge transition states of two aggregate calls.  They're only
used during the finalfn call, so we can treat them as like the finalfn
itself.  This allows, eg, merging of

select
  percentile_cont(0.25) within group (order by a),
  percentile_disc(0.5) within group (order by a)
from ...

This didn't matter (and could not have been tested) before we allowed
state merging of OSAs otherwise.

Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com
2017-10-16 16:02:51 -04:00
Tom Lane be0ebb65f5 Allow the built-in ordered-set aggregates to share transition state.
The built-in OSAs all share the same transition function, so they can
share transition state as long as the final functions cooperate to not
do the sort step more than once.  To avoid running the tuplesort object
in randomAccess mode unnecessarily, add a bit of infrastructure to
nodeAgg.c to let the aggregate functions find out whether the transition
state is actually being shared or not.

This doesn't work for the hypothetical aggregates, since those inject
a hypothetical row that isn't traceable to the shared input state.
So they remain marked aggfinalmodify = 'w'.

Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com
2017-10-16 15:51:23 -04:00
Tom Lane c3dfe0fec0 Repair breakage of aggregate FILTER option.
An aggregate's input expression(s) are not supposed to be evaluated
at all for a row where its FILTER test fails ... but commit 8ed3f11bb
overlooked that requirement.  Reshuffle so that aggregates having a
filter clause evaluate their arguments separately from those without.
This still gets the benefit of doing only one ExecProject in the
common case of multiple Aggrefs, none of which have filters.

While at it, arrange for filter clauses to be included in the common
ExecProject evaluation, thus perhaps buying a little bit even when
there are filters.

Back-patch to v10 where the bug was introduced.

Discussion: https://postgr.es/m/30065.1508161354@sss.pgh.pa.us
2017-10-16 15:24:36 -04:00
Alvaro Herrera 60a1d96ed7 Rework DefineIndex relkind check
Simplify coding using a switch rather than nested if tests.

Author: Álvaro
Reviewed-by: Robert Haas, Amit Langote, Michaël Paquier
Discussion: https://postgr.es/m/20171013163820.pai7djcaxrntaxtn@alvherre.pgsql
2017-10-16 12:22:18 +02:00
Tom Lane 5fc438fb25 Restore nodeAgg.c's ability to check for improperly-nested aggregates.
While poking around in the aggregate logic, I noticed that commit
8ed3f11bb broke the logic in nodeAgg.c that purports to detect nested
aggregates, by moving initialization of regular aggregate argument
expressions out of the code segment that checks for that.

You could argue that this check is unnecessary, but it's not much code
so I'm inclined to keep it as a backstop against parser and planner
bugs.  However, there's certainly zero value in checking only some of
the subexpressions.

We can make the check complete again, and as a bonus make it a good
deal more bulletproof against future mistakes of the same ilk, by
moving it out to the outermost level of ExecInitAgg.  This means we
need to check only once per Agg node not once per aggregate, which
also seems like a good thing --- if the check does find something
wrong, it's not urgent that we report it before the plan node
initialization finishes.

Since this requires remembering the original length of the aggs list,
I deleted a long-obsolete stanza that changed numaggs from 0 to 1.
That's so old it predates our decision that palloc(0) is a valid
operation, in (digs...) 2004, see commit 24a1e20f1.

In passing improve a few comments.

Back-patch to v10, just in case.
2017-10-15 19:19:18 -04:00
Tom Lane 4de2d4fba3 Explicitly track whether aggregate final functions modify transition state.
Up to now, there's been hard-wired assumptions that normal aggregates'
final functions never modify their transition states, while ordered-set
aggregates' final functions always do.  This has always been a bit
limiting, and in particular it's getting in the way of improving the
built-in ordered-set aggregates to allow merging of transition states.
Therefore, let's introduce catalog and CREATE AGGREGATE infrastructure
that lets the finalfn's behavior be declared explicitly.

There are now three possibilities for the finalfn behavior: it's purely
read-only, it trashes the transition state irrecoverably, or it changes
the state in such a way that no more transfn calls are possible but the
state can still be passed to other, compatible finalfns.  There are no
examples of this third case today, but we'll shortly make the built-in
OSAs act like that.

This change allows user-defined aggregates to explicitly disclaim support
for use as window functions, and/or to prevent transition state merging,
if their implementations cannot handle that.  While it was previously
possible to handle the window case with a run-time error check, there was
not any way to prevent transition state merging, which in retrospect is
something commit 804163bc2 should have provided for.  But better late
than never.

In passing, split out pg_aggregate.c's extern function declarations into
a new header file pg_aggregate_fn.h, similarly to what we've done for
some other catalog headers, so that pg_aggregate.h itself can be safe
for frontend files to include.  This lets pg_dump use the symbolic
names for relevant constants.

Discussion: https://postgr.es/m/4834.1507849699@sss.pgh.pa.us
2017-10-14 15:21:39 -04:00
Andres Freund 141fd1b66c Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
   entries - instead store them as a Datum array. This also allows to
   get rid of having to build dummy tuples for negative & list
   entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so imortant entries are more
   likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes allows to unroll loops and avoid
   other nkeys dependant initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
   greatly reduces cache unnecessary cpu cache misses.
6) Split of the cache-miss case from the hash lookup, reducing stack
   allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
   piece.

This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.

I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles. Unfortunately it's not easy to do so safely as
  an entry's memory location can change at various times, which
  doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
  but the win isn't big and the code for it is ugly, because the
  tuples have to be freed as well.
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in
  profiles.

The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 14:22:41 -07:00
Robert Haas 6393613b6a Fix possible crash with Parallel Bitmap Heap Scan.
If a Parallel Bitmap Heap scan's chain of leftmost descendents
includes a BitmapOr whose first child is a BitmapAnd, the prior coding
would mistakenly create a non-shared TIDBitmap and then try to perform
shared iteration.

Report by Tomas Vondra.  Patch by Dilip Kumar.

Discussion: http://postgr.es/m/50e89684-8ad9-dead-8767-c9545bafd3b6@2ndquadrant.com
2017-10-13 15:02:45 -04:00
Tom Lane 73937119bf Improve implementation of CRE-stack-flattening in map_variable_attnos().
I (tgl) objected to the obscure implementation introduced in commit
1c497fa72.  This one seems a bit less action-at-a-distance-y, at the
price of repeating a few lines of code.

Improve the comments about what the function is doing, too.

Amit Khandekar, whacked around a bit more by me

Discussion: https://postgr.es/m/CAJ3gD9egYTyHUH0nTMxm8-1m3RvdqEbaTyGC-CUNtYf7tKNDaQ@mail.gmail.com
2017-10-13 13:43:55 -04:00
Peter Eisentraut 7d1b8e7591 Attempt to fix LDAP build
Apparently, an older spelling of LDAP_OPT_DIAGNOSTIC_MESSAGE is
LDAP_OPT_ERROR_STRING, so fall back to that one.
2017-10-12 23:47:48 -04:00
Peter Eisentraut cf1238cd97 Log diagnostic messages if errors occur during LDAP auth.
Diagnostic messages seem likely to help users diagnose root
causes more easily, so let's report them as errdetail.

Author: Thomas Munro
Reviewed-By: Ashutosh Bapat, Christoph Berg, Alvaro Herrera, Peter Eisentraut
Discussion: https://postgr.es/m/CAEepm=2_dA-SYpFdmNVwvKsEBXOUj=K4ooKovHmvj6jnMdt8dw@mail.gmail.com
2017-10-12 22:37:14 -04:00
Peter Eisentraut 1feff99fe4 Improve LDAP cleanup code in error paths.
After calling ldap_unbind_s() we probably shouldn't try to use the LDAP
connection again to call ldap_get_option(), even if it failed.  The OpenLDAP
man page for ldap_unbind[_s] says "Once it is called, the connection to the
LDAP server is closed, and the ld structure is invalid."  Otherwise, as a
general rule we should probably call ldap_unbind() before returning in all
paths to avoid leaking resources.  It is unlikely there is any practical
leak problem since failure to authenticate currently results in the backend
exiting soon afterwards.

Author: Thomas Munro
Reviewed-By: Alvaro Herrera, Peter Eisentraut
Discussion: https://postgr.es/m/20170914141205.eup4kxzlkagtmfac%40alvherre.pgsql
2017-10-12 22:37:14 -04:00
Robert Haas 1c497fa72d Avoid coercing a whole-row variable that is already coerced.
Marginal efficiency and beautification hack.  I'm not sure whether
this case ever arises currently, but the pending patch for update
tuple routing will cause it to arise.

Amit Khandekar

Discussion: http://postgr.es/m/CAJ3gD9cazfppe7-wwUbabPcQ4_0SubkiPFD1+0r5_DkVNWo5yg@mail.gmail.com
2017-10-12 17:10:48 -04:00
Robert Haas 60f7c0abef Use ResultRelInfo ** rather than ResultRelInfo * for tuple routing.
The previous convention doesn't lend itself to creating ResultRelInfos
lazily, as we already do in ExecGetTriggerResultRel.  This patch
doesn't make anything lazier than before, but the pending patch for
UPDATE tuple routing proposes to do so (and there might be other
opportunities as well).

Amit Khandekar with some adjustments by me.

Discussion: http://postgr.es/m/CA+TgmoYPVP9Lyf6vUFA5DwxS4c--x6LOj2y36BsJaYtp62eXPQ@mail.gmail.com
2017-10-12 16:50:53 -04:00
Tom Lane 305cf1fd72 Fix AggGetAggref() so it won't lie to aggregate final functions.
If we merge the transition calculations for two different aggregates,
it's reasonable to assume that the transition function should not care
which of those Aggref structs it gets from AggGetAggref().  It is not
reasonable to make the same assumption about an aggregate final function,
however.  Commit 804163bc2 broke this, as it will pass whichever Aggref
was first associated with the transition state in both cases.

This doesn't create an observable bug so far as the core system is
concerned, because the only existing uses of AggGetAggref() are in
ordered-set aggregates that happen to not pay attention to anything
but the input properties of the Aggref; and besides that, we disabled
sharing of transition calculations for OSAs yesterday.  Nonetheless,
if some third-party code were using AggGetAggref() in a normal aggregate,
they would be entitled to call this a bug.  Hence, back-patch the fix
to 9.6 where the problem was introduced.

In passing, improve some of the comments about transition state sharing.

Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com
2017-10-12 15:20:16 -04:00
Robert Haas ad4a7ed099 Synchronize error messages.
Commits 6476b26115
and 14f67a8ee2 didn't use quite the
same error message for what is basically the same situation.

Amit Langote, pared back a bit by me.

Discussion: http://postgr.es/m/54dc76d0-3b5b-ba5a-27dc-fb31a3975b61@lab.ntt.co.jp
2017-10-12 15:14:22 -04:00
Alvaro Herrera e9ef11ac8b Infer functional dependency past RelabelType
Vars hidden within a RelabelType would not be detected as compatible
with some functional dependency.  Repair by properly ignoring the
RelabelType.

Author: David Rowley
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAKJS1f-y-UEy=rsBXynBOgiW1fKMr_LVoYSGL9QOc36mLEC-ww@mail.gmail.com
2017-10-12 17:23:47 +02:00
Robert Haas 360fd1a7b2 Fix logical replication to fire BEFORE ROW DELETE triggers.
Before, that would fail to happen unless a BEFORE ROW UPDATE trigger
was also present.

Noted by me while reviewing a patch from Masahiko Sawada, who also
wrote this patch.  Reviewed by Petr Jelinek.

Discussion: http://postgr.es/m/CA+TgmobAZvCxduG8y_mQKBK7nz-vhbdLvjM354KEFozpuzMN5A@mail.gmail.com
2017-10-12 10:26:55 -04:00
Andres Freund 31079a4a8e Replace remaining uses of pq_sendint with pq_sendint{8,16,32}.
pq_sendint() remains, so extension code doesn't unnecessarily break.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
2017-10-11 21:00:46 -07:00
Tom Lane 52328727be Prevent sharing transition states between ordered-set aggregates.
This ought to work, but the built-in OSAs are not capable of coping,
because their final-functions destructively modify their transition
state (specifically, the contained tuplesort object).  That was fine
when those functions were written, but commit 804163bc2 moved the
goalposts without telling orderedsetaggs.c.

We should fix the built-in OSAs to support this, but it will take
a little work, especially if we don't want to sacrifice performance
in the normal non-shared-state case.  Given that it took a year after
9.6 release for anyone to notice this bug, we should not prioritize
sharable-state over nonsharable-state performance.  And a proper fix
is likely to be more complicated than we'd want to back-patch, too.

Therefore, let's just put in this stop-gap patch to prevent nodeAgg.c
from choosing to use shared state for OSAs.  We can revert it in HEAD
when we get a better fix.

Report from Lukas Eder, diagnosis by me, patch by David Rowley.
Back-patch to 9.6 where the problem was introduced.

Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com
2017-10-11 22:18:10 -04:00
Andres Freund 4c119fbcd4 Improve performance of SendRowDescriptionMessage.
There's three categories of changes leading to better performance:
- Splitting the per-attribute part of SendRowDescriptionMessage into a
  v2 and a v3 version allows avoiding branches for every attribute.
- Preallocating the size of the buffer to be big enough for all
  attributes and then using pq_write* avoids unnecessary buffer
  size checks & resizing.
- Reusing a persistently allocated StringInfo for all
  SendRowDescriptionMessage() invocations avoids repeated allocations
  & reallocations.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
2017-10-11 17:23:23 -07:00
Robert Haas cff440d368 pg_stat_statements: Widen query IDs from 32 bits to 64 bits.
This takes advantage of the infrastructure introduced by commit
81c5e46c49 to greatly reduce the
likelihood that two different queries will end up with the same query
ID.  It's still possible, of course, but whereas before it the chances
of a collision reached 25% around 50,000 queries, it will now take
more than 3 billion queries.

Backward incompatibility: Because the type exposed at the SQL level is
int8, users may now see negative query IDs in the pg_stat_statements
view (and also, query IDs more than 4 billion, which was the old
limit).

Patch by me, reviewed by Michael Paquier and Peter Geoghegan.

Discussion: http://postgr.es/m/CA+TgmobG_Kp4cBKFmsznUAaM1GWW6hhRNiZC0KjRMOOeYnz5Yw@mail.gmail.com
2017-10-11 19:52:46 -04:00
Andres Freund f2dec34e19 Use one stringbuffer for all rows printed in printtup.c.
This avoids newly allocating, and then possibly growing, the
stringbuffer for every row. For wide rows this can substantially
reduce memory allocator overhead, at the price of not immediately
reducing memory usage after outputting an especially wide row.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
2017-10-11 16:26:35 -07:00
Andres Freund 1de09ad8eb Add more efficient functions to pqformat API.
There's three prongs to achieve greater efficiency here:

1) Allow reusing a stringbuffer across pq_beginmessage/endmessage,
   with the new pq_beginmessage_reuse/endmessage_reuse. This can be
   beneficial both because it avoids allocating the initial buffer,
   and because it's more likely to already have an correctly sized
   buffer.

2) Replacing pq_sendint() with pq_sendint$width() inline
   functions. Previously unnecessary and unpredictable branches in
   pq_sendint() were needed. Additionally the replacement functions
   are implemented more efficiently.  pq_sendint is now deprecated, a
   separate commit will convert all in-tree callers.

3) Add pq_writeint$width(), pq_writestring(). These rely on sufficient
   space in the StringInfo's buffer, avoiding individual space checks
   & potential individual resizing.  To allow this to be used for
   strings, expose mbutil.c's MAX_CONVERSION_GROWTH.

Followup commits will make use of these facilities.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
2017-10-11 16:01:52 -07:00
Andres Freund 70c2d1be2b Allow to avoid NUL-byte management for stringinfos and use in format.c.
In a lot of the places having appendBinaryStringInfo() maintain a
trailing NUL byte wasn't actually meaningful, e.g. when appending an
integer which can contain 0 in one of its bytes.

Removing this yields some small speedup, but more importantly will be
more consistent when providing faster variants of pq_sendint etc.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
2017-10-11 16:01:52 -07:00
Tom Lane 5fa6b0d102 Remove unnecessary PG_TRY overhead for CurrentResourceOwner changes.
resowner/README contained advice to use a PG_TRY block to restore the
old CurrentResourceOwner value anywhere that that variable is transiently
changed.  That advice was only inconsistently followed, however, and
on reflection it seems like unnecessary overhead.  We don't bother
with such a convention for transient CurrentMemoryContext changes,
on the grounds that any (sub)transaction abort will start out by
resetting CurrentMemoryContext to what it wants.  But the same is
true of CurrentResourceOwner, so there seems no need to treat it
differently.

Hence, remove PG_TRY blocks that exist only to restore CurrentResourceOwner
before re-throwing the error.  There are a couple of places that restore
it along with some other actions, and I left those alone; the restore is
probably unnecessary but no noticeable gain will result from removing it.

Discussion: https://postgr.es/m/5236.1507583529@sss.pgh.pa.us
2017-10-11 17:44:09 -04:00
Andres Freund f676616651 Prevent idle in transaction session timeout from sometimes being ignored.
The previous coding in ProcessInterrupts() could lead to
idle_in_transaction_session_timeout being ignored, when
statement_timeout occurred earlier.

The problem was that ProcessInterrupts() would return before
processing the transaction timeout if QueryCancelPending was set while
QueryCancelHoldoffCount != 0 - which is the case when reading new
commands from the client. Ergo when the idle transaction timeout would
hit.

Fix that by removing the early return. Alternatively the transaction
timeout code could have been moved up, but that early return seems
like an issue that could hit other cases too.

Author: Lukas Fittl
Bug: #14821
Discussion:
    https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org
    https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2CF_Lpc6gVOFnc0-gazw@mail.gmail.com
Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced.
2017-10-11 14:02:41 -07:00
Tom Lane 2860596832 Doc: fix missing explanation of default object privileges.
The GRANT reference page, which lists the default privileges for new
objects, failed to mention that USAGE is granted by default for data
types and domains.  As a lesser sin, it also did not specify anything
about the initial privileges for sequences, FDWs, foreign servers,
or large objects.  Fix that, and add a comment to acldefault() in the
probably vain hope of getting people to maintain this list in future.

Noted by Laurenz Albe, though I editorialized on the wording a bit.
Back-patch to all supported branches, since they all have this behavior.

Discussion: https://postgr.es/m/1507620895.4152.1.camel@cybertec.at
2017-10-11 16:57:14 -04:00
Robert Haas 20d210bf5b Fix mistakes in comments.
Masahiko Sawada

Discussion: http://postgr.es/m/CAD21AoBsfYsMHD6_SL9iN3n_Foaa+oPbL5jG55DxU1ChaujqwQ@mail.gmail.com
2017-10-11 15:55:10 -04:00
Tom Lane 118e99c3d7 Fix low-probability loss of NOTIFY messages due to XID wraparound.
Up to now async.c has used TransactionIdIsInProgress() to detect whether
a notify message's source transaction is still running.  However, that
function has a quick-exit path that reports that XIDs before RecentXmin
are no longer running.  If a listening backend is doing nothing but
listening, and not running any queries, there is nothing that will advance
its value of RecentXmin.  Once 2 billion transactions elapse, the
RecentXmin check causes active transactions to be reported as not running.
If they aren't committed yet according to CLOG, async.c decides they
aborted and discards their messages.  The timing for that is a bit tight
but it can happen when multiple backends are sending notifies concurrently.
The net symptom therefore is that a sufficiently-long-surviving
listen-only backend starts to miss some fraction of NOTIFY traffic,
but only under heavy load.

The only function that updates RecentXmin is GetSnapshotData().
A brute-force fix would therefore be to take a snapshot before
processing incoming notify messages.  But that would add cycles,
as well as contention for the ProcArrayLock.  We can be smarter:
having taken the snapshot, let's use that to check for running
XIDs, and not call TransactionIdIsInProgress() at all.  In this
way we reduce the number of ProcArrayLock acquisitions from one
per message to one per notify interrupt; that's the same under
light load but should be a benefit under heavy load.  Light testing
says that this change is a wash performance-wise for normal loads.

I looked around for other callers of TransactionIdIsInProgress()
that might be at similar risk, and didn't find any; all of them
are inside transactions that presumably have already taken a
snapshot.

Problem report and diagnosis by Marko Tiikkaja, patch by me.
Back-patch to all supported branches, since it's been like this
since 9.0.

Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org
2017-10-11 14:28:33 -04:00
Andres Freund fffd651e83 Rewrite strnlen replacement implementation from 8a241792f9.
The previous placement of the fallback implementation in libpgcommon
was problematic, because libpqport functions need strnlen
functionality.

Move replacement into libpgport. Provide strnlen() under its posix
name, instead of pg_strnlen(). Fix stupid configure bug, executing the
test only when compiled with threading support.

Author: Andres Freund
Discussion: https://postgr.es/m/E1e1gR2-0005fB-SI@gemulon.postgresql.org
2017-10-10 14:50:30 -07:00
Andres Freund 82c117cb90 Fix pnstrdup() to not memcpy() the maximum allowed length.
The previous behaviour was dangerous if the length passed wasn't the
size of the underlying buffer, but the maximum size of the underlying
buffer.

Author: Andres Freund
Discussion: https://postgr.es/m/20161003215524.mwz5p45pcverrkyk@alap3.anarazel.de
2017-10-09 15:20:42 -07:00
Andres Freund 84ad4b036d Reduce memory usage of targetlist SRFs.
Previously nodeProjectSet only released memory once per input tuple,
rather than once per returned tuple. If the computation of an
individual returned tuple requires a lot of memory, that can lead to
problems.

Instead change things so that the expression context can be reset once
per output tuple, which requires a new memory context to store SRF
arguments in.

This is a longstanding issue, but was hard to fix before 9.6, due to
the way tSRFs where evaluated. But it's fairly easy to fix now. We
could backpatch this into 10, but given there've been fewc omplaints
that doesn't seem worth the risk so far.

Reported-By: Lucas Fairchild
Author: Andres Freund, per discussion with Tom Lane
Discussion: https://postgr.es/m/4514.1507318623@sss.pgh.pa.us
2017-10-08 15:08:25 -07:00
Tom Lane 643c27e36f Increase distance between flush requests during bulk file copies.
copy_file() reads and writes data 64KB at a time (with default BLCKSZ),
and historically has issued a pg_flush_data request after each write.
This turns out to interact really badly with macOS's new APFS file
system: a large file copy takes over 100X longer than it ought to on
APFS, as reported by Brent Dearth.  While that's arguably a macOS bug,
it's not clear whether Apple will do anything about it in the near
future, and in any case experimentation suggests that issuing flushes
a bit less often can be helpful on other platforms too.

Hence, rearrange the logic in copy_file() so that flush requests are
issued once per N writes rather than every time through the loop.
I set the FLUSH_DISTANCE to 32MB on macOS (any less than that still
results in a noticeable speed degradation on APFS), but 1MB elsewhere.
In limited testing on Linux and FreeBSD, this seems slightly faster
than the previous code, and certainly no worse.  It helps noticeably
on macOS even with the older HFS filesystem.

A simpler change would have been to just increase the size of the
copy buffer without changing the loop logic, but that seems likely
to trash the processor cache without really helping much.

Back-patch to 9.6 where we introduced msync() as an implementation
option for pg_flush_data().  The problem seems specific to APFS's
mmap/msync support, so I don't think we need to go further back.

Discussion: https://postgr.es/m/CADkxhTNv-j2jw2g8H57deMeAbfRgYBoLmVuXkC=YCFBXRuCOww@mail.gmail.com
2017-10-08 15:25:26 -04:00
Tom Lane 8ec5429e2f Reduce "X = X" to "X IS NOT NULL", if it's easy to do so.
If the operator is a strict btree equality operator, and X isn't volatile,
then the clause must yield true for any non-null value of X, or null if X
is null.  At top level of a WHERE clause, we can ignore the distinction
between false and null results, so it's valid to simplify the clause to
"X IS NOT NULL".  This is a useful improvement mainly because we'll get
a far better selectivity estimate in most cases.

Because such cases seldom arise in well-written queries, it is unappetizing
to expend a lot of planner cycles looking for them ... but it turns out
that there's a place we can shoehorn this in practically for free, because
equivclass.c already has to detect and reject candidate equivalences of the
form X = X.  That doesn't catch every place that it would be valid to
simplify to X IS NOT NULL, but it catches the typical case.  Working harder
doesn't seem justified.

Patch by me, reviewed by Petr Jelinek

Discussion: https://postgr.es/m/CAMjNa7cC4X9YR-vAJS-jSYCajhRDvJQnN7m2sLH1wLh-_Z2bsw@mail.gmail.com
2017-10-08 12:23:32 -04:00
Tom Lane 1518d07842 Fix crash when logical decoding is invoked from a PL function.
The logical decoding functions do BeginInternalSubTransaction and
RollbackAndReleaseCurrentSubTransaction to clean up after themselves.
It turns out that AtEOSubXact_SPI has an unrecognized assumption that
we always need to cancel the active SPI operation in the SPI context
that surrounds the subtransaction (if there is one).  That's true
when the RollbackAndReleaseCurrentSubTransaction call is coming from
the SPI-using function itself, but not when it's happening inside
some unrelated function invoked by a SPI query.  In practice the
affected callers are the various PLs.

To fix, record the current subtransaction ID when we begin a SPI
operation, and clean up only if that ID is the subtransaction being
canceled.

Also, remove AtEOSubXact_SPI's assertion that it must have cleaned
up the surrounding SPI context's active tuptable.  That's proven
wrong by the same test case.

Also clarify (or, if you prefer, reinterpret) the calling conventions
for _SPI_begin_call and _SPI_end_call.  The memory context cleanup
in the latter means that these have always had the flavor of a matched
resource-management pair, but they weren't documented that way before.

Per report from Ben Chobot.

Back-patch to 9.4 where logical decoding came in.  In principle,
the SPI changes should go all the way back, since the problem dates
back to commit 7ec1c5a86.  But given the lack of field complaints
it seems few people are using internal subtransactions in this way.
So I don't feel a need to take any risks in 9.2/9.3.

Discussion: https://postgr.es/m/73FBA179-C68C-4540-9473-71E865408B15@silentmedia.com
2017-10-06 19:18:58 -04:00