Commit Graph

14051 Commits

Author SHA1 Message Date
Tom Lane d19bd29f07 Reset pg_stat_activity.xact_start during PREPARE TRANSACTION.
Once we've completed a PREPARE, our session is not running a transaction,
so its entry in pg_stat_activity should show xact_start as null, rather
than leaving the value as the start time of the now-prepared transaction.

I think possibly this oversight was triggered by faulty extrapolation
from the adjacent comment that says PrepareTransaction should not call
AtEOXact_PgStat, so tweak the wording of that comment.

Noted by Andres Freund while considering bug #10123 from Maxim Boguk,
although this error doesn't seem to explain that report.

Back-patch to all active branches.
2014-04-24 13:29:48 -04:00
Tom Lane f0fedfe82c Allow polymorphic aggregates to have non-polymorphic state data types.
Before 9.4, such an aggregate couldn't be declared, because its final
function would have to have polymorphic result type but no polymorphic
argument, which CREATE FUNCTION would quite properly reject.  The
ordered-set-aggregate patch found a workaround: allow the final function
to be declared as accepting additional dummy arguments that have types
matching the aggregate's regular input arguments.  However, we failed
to notice that this problem applies just as much to regular aggregates,
despite the fact that we had a built-in regular aggregate array_agg()
that was known to be undeclarable in SQL because its final function
had an illegal signature.  So what we should have done, and what this
patch does, is to decouple the extra-dummy-arguments behavior from
ordered-set aggregates and make it generally available for all aggregate
declarations.  We have to put this into 9.4 rather than waiting till
later because it slightly alters the rules for declaring ordered-set
aggregates.

The patch turned out a bit bigger than I'd hoped because it proved
necessary to record the extra-arguments option in a new pg_aggregate
column.  I'd thought we could just look at the final function's pronargs
at runtime, but that didn't work well for variadic final functions.
It's probably just as well though, because it simplifies life for pg_dump
to record the option explicitly.

While at it, fix array_agg() to have a valid final-function signature,
and add an opr_sanity test to notice future deviations from polymorphic
consistency.  I also marked the percentile_cont() aggregates as not
needing extra arguments, since they don't.
2014-04-23 19:17:41 -04:00
Heikki Linnakangas a4ad9afec2 Update obsolete comments.
We no longer have a TLI field in the page header.
2014-04-23 14:41:51 +03:00
Heikki Linnakangas 8fbfbf1472 Fix typos in comment. 2014-04-23 12:56:41 +03:00
Heikki Linnakangas 4fafc4ecd9 Cleanup of new b-tree page deletion code.
When marking a branch as half-dead, a pointer to the top of the branch is
stored in the leaf block's hi-key. During normal operation, the high key
was left in place, and the block number was just stored in the ctid field
of the high key tuple, but in WAL replay, the high key was recreated as a
truncated tuple with zero columns. For the sake of easier debugging, also
truncate the tuple in normal operation, so that the page is identical
after WAL replay. Also, rename the 'downlink' field in the WAL record to
'topparent', as that seems like a more descriptive name. And make sure
it's set to invalid when unlinking the leaf page.
2014-04-23 10:19:54 +03:00
Tom Lane d26b042ce5 Fix documentation of FmgrInfo.fn_nargs.
Some ancient comments claimed that fn_nargs could be -1 to indicate a
variable number of input arguments; but this was never implemented, and
is at variance with what we ultimately did with "variadic" functions.
Update the comments.
2014-04-22 23:22:12 -04:00
Tom Lane c6a4ace5bf Fix broken logic in logical_heap_rewrite_flush_mappings().
It's blatantly obvious that commit 4d0d607a45
wasn't tested.  The leak's real enough, though.
2014-04-22 22:33:35 -04:00
Bruce Momjian cee850c403 revert 4d0d607a45
Revert due to contrib/test_decoding regression failure
2014-04-22 22:21:54 -04:00
Bruce Momjian 4d0d607a45 release memory used while flushing logical mappings
Patch by Ants Aasma
2014-04-22 18:05:44 -04:00
Heikki Linnakangas 4a5d55ec2b Fix bug in the new B-tree incomplete-split code.
Forgot to update LSN of left sibling's page, when creating a new root.
I fixed this for regular insertions and page splits earlier, but missed
new root creation.
2014-04-22 22:40:44 +03:00
Heikki Linnakangas 45e67a2ad7 Fix Gin README.
The README incorrectly claimed that GIN posting tree pages contain an array
of uncompressed items in addition to compressed posting lists. Earlier
versions of the GIN posting list compression patch worked that way, but not
the one that was committed.
2014-04-22 22:39:50 +03:00
Heikki Linnakangas 77fe2b6d79 Fix bug in new B-tree page deletion code.
When modifying a page, must hold an exclusive lock. A shared lock is
obviously not good enough.
2014-04-22 15:34:54 +03:00
Heikki Linnakangas 7e30c186da Retain original physical order of tuples in redo of b-tree splits.
It makes no difference to the system, but minimizing the differences
between a master and standby makes debugging simpler.
2014-04-22 13:03:37 +03:00
Heikki Linnakangas 7d98054f0d Fix rm_desc routine of b-tree page delete records.
A couple of typos from my refactoring of the page deletion patch.
2014-04-22 13:02:52 +03:00
Heikki Linnakangas 8d34f68628 Avoid transient bogus page contents when creating a sequence.
Don't use simple_heap_insert to insert the tuple to a sequence relation.
simple_heap_insert creates a heap insertion WAL record, and replaying that
will create a regular heap page without the special area containing the
sequence magic constant, which is wrong for a sequence. That was not a bug
because we always created a sequence WAL record after that, and replaying
that overwrote the bogus heap page, and the transient state could never be
seen by another backend because it was only done when creating a new
sequence relation. But it's simpler and cleaner to avoid that in the first
place.
2014-04-22 10:40:23 +03:00
Robert Haas fab6170cab Fix typo.
Etsuro Fujita
2014-04-20 16:30:55 +02:00
Magnus Hagander 66b1084e2c Fix typo
Amit Langote
2014-04-18 12:49:54 +02:00
Bruce Momjian 83defef8c7 report stat() error in trigger file check
Permissions might prevent the existence of the trigger file from being
checked.

Per report from Andres Freund
2014-04-17 11:55:57 -04:00
Heikki Linnakangas 2a8e1ac598 Set the all-visible flag on heap page before writing WAL record, not after.
If we set the all-visible flag after writing WAL record, and XLogInsert
takes a full-page image of the page, the image would not include the flag.
We will then proceed to set the VM bit, which would then be set without the
corresponding all-visible flag on the heap page.

Found by comparing page images on master and standby, after writing/replaying
each WAL record. (There is still a discrepancy: the all-visible flag won't
be set after replaying the HEAP_CLEAN record, even though it is set in the
master. However, it will be set when replaying the HEAP2_VISIBLE record and
setting the VM bit, so the all-visible flag and VM bit are always consistent
on the standby, even though they are momentarily out-of-sync with master)

Backpatch to 9.3 where this code was introduced.
2014-04-17 17:47:50 +03:00
Tom Lane 5f86cbd714 Rename EXPLAIN ANALYZE's "total runtime" output to "execution time".
Now that EXPLAIN also outputs a "planning time" measurement, the use of
"total" here seems rather confusing: it sounds like it might include the
planning time which of course it doesn't.  Majority opinion was that
"execution time" is a better label, so we'll call it that.

This should be noted as a backwards incompatibility for tools that examine
EXPLAIN ANALYZE output.

In passing, I failed to resist the temptation to do a little editing on the
materialized-view example affected by this change.
2014-04-16 20:48:59 -04:00
Alvaro Herrera 83ab8e32f2 Fix object identities for text search objects
We were neglecting to schema-qualify them.

Backpatch to 9.3, where object identities were introduced as a concept
by commit f8348ea32e.
2014-04-16 18:25:44 -03:00
Tom Lane cad4fe6455 Use AF_UNSPEC not PF_UNSPEC in getaddrinfo calls.
According to the Single Unix Spec and assorted man pages, you're supposed
to use the constants named AF_xxx when setting ai_family for a getaddrinfo
call.  In a few places we were using PF_xxx instead.  Use of PF_xxx
appears to be an ancient BSD convention that was not adopted by later
standardization.  On BSD and most later Unixen, it doesn't matter much
because those constants have equivalent values anyway; but nonetheless
this code is not per spec.

In the same vein, replace PF_INET by AF_INET in one socket() call, which
wasn't even consistent with the other socket() call in the same function
let alone the remainder of our code.

Per investigation of a Cygwin trouble report from Marco Atzeri.  It's
probably a long shot that this will fix his issue, but it's wrong in
any case.
2014-04-16 13:21:20 -04:00
Robert Haas dfc0219f64 Add to_regprocedure() and to_regoperator().
These are natural complements to the functions added by commit
0886fc6a5c, but they weren't included
in the original patch for some reason.  Add them.

Patch by me, per a complaint by Tom Lane.  Review by Tatsuo
Ishii.
2014-04-16 12:21:43 -04:00
Robert Haas 1a81daab8b Try to fix spurious DSM failures on Windows.
Apparently, Windows can sometimes return an error code even when the
operation actually worked just fine.  Rearrange the order of checks
according to what appear to be the best practices in this area.

Amit Kapila
2014-04-16 12:04:44 -04:00
Bruce Momjian 4180934651 check socket creation errors against PGINVALID_SOCKET
Previously, in some places, socket creation errors were checked for
negative values, which is not true for Windows because sockets are
unsigned.  This masked socket creation errors on Windows.

Backpatch through 9.0.  8.4 doesn't have the infrastructure to fix this.
2014-04-16 10:45:48 -04:00
Heikki Linnakangas 848b9f05ab Use correctly-sized buffer when zero-filling a WAL file.
I mixed up BLCKSZ and XLOG_BLCKSZ when I changed the way the buffer is
allocated a couple of weeks ago. With the default settings, they are both
8k, but they can be changed at compile-time.
2014-04-16 10:26:36 +03:00
Heikki Linnakangas f1dadd34fa Set pd_lower on internal GIN posting tree pages.
This allows squeezing out the unused space in full-page writes. And more
importantly, it can be a useful debugging aid.

In hindsight we should've done this back when GIN was added - we wouldn't
need the 'maxoff' field in the page opaque struct if we had used pd_lower
and pd_upper like on normal pages. But as long as there can be pages in the
index that have been binary-upgraded from pre-9.4 versions, we can't rely
on that, and have to continue using 'maxoff'.

Most of the code churn comes from renaming some macros, now that they're
used on internal pages, too.

This change is completely backwards-compatible, no effect on pg_upgrade.
2014-04-14 21:13:19 +03:00
Tom Lane 4dfb065b3a Fix bogus handling of bad strategy number in GIST consistent() functions.
Make sure we throw an error instead of silently doing the wrong thing when
fed a strategy number we don't recognize.  Also, in the places that did
already throw an error, spell the error message in a way more consistent
with our message style guidelines.

Per report from Paul Jones.  Although this is a bug, it won't occur unless
a superuser tries to do something he shouldn't, so it doesn't seem worth
back-patching.
2014-04-14 11:18:47 -04:00
Heikki Linnakangas e3e6e3af56 Remove dead checks for invalid left page in ginDeletePage.
In some places, the function assumes the left page is valid, and in others,
it checks if it is valid. Remove all the checks.
2014-04-14 15:27:32 +03:00
Heikki Linnakangas 1bd3842163 GIN entry pages follow the standard page layout - tell XLogInsert.
The entry B-tree pages all follow the standard page layout. The 9.3 code has
this right. I inadvertently changed this at some point during the big
refactorings in git master.
2014-04-14 14:51:28 +03:00
Tom Lane e0c91a7ff0 Improve some O(N^2) behavior in window function evaluation.
Repositioning the tuplestore seek pointer in window_gettupleslot() turns
out to be a very significant expense when the window frame is sizable and
the frame end can move.  To fix, introduce a tuplestore function for
skipping an arbitrary number of tuples in one call, parallel to the one we
introduced for tuplesort objects in commit 8d65da1f.  This reduces the cost
of window_gettupleslot() to O(1) if the tuplestore has not spilled to disk.
As in the previous commit, I didn't try to do any real optimization of
tuplestore_skiptuples for the case where the tuplestore has spilled to
disk.  There is probably no practical way to get the cost to less than O(N)
anyway, but perhaps someone can think of something later.

Also fix PersistHoldablePortal() to make use of this API now that we have
it.

Based on a suggestion by Dean Rasheed, though this turns out not to look
much like his patch.
2014-04-13 13:59:17 -04:00
Stephen Frost 5f508b6dea Make a dedicated AlterTblSpcStmt production
Given that ALTER TABLESPACE has moved on from just existing for
general purpose rename/owner changes, it deserves its own top-level
production in the grammar.  This also cleans up the RenameStmt to
only ever be used for actual RENAMEs again- it really wasn't
appropriate to hide non-RENAME productions under there.

Noted by Alvaro.
2014-04-13 01:02:44 -04:00
Tom Lane d95425c8b9 Provide moving-aggregate support for boolean aggregates.
David Rowley and Florian Pflug, reviewed by Dean Rasheed
2014-04-13 00:01:46 -04:00
Stephen Frost 842faa714c Make security barrier views automatically updatable
Views which are marked as security_barrier must have their quals
applied before any user-defined quals are called, to prevent
user-defined functions from being able to see rows which the
security barrier view is intended to prevent them from seeing.

Remove the restriction on security barrier views being automatically
updatable by adding a new securityQuals list to the RTE structure
which keeps track of the quals from security barrier views at each
level, independently of the user-supplied quals.  When RTEs are
later discovered which have securityQuals populated, they are turned
into subquery RTEs which are marked as security_barrier to prevent
any user-supplied quals being pushed down (modulo LEAKPROOF quals).

Dean Rasheed, reviewed by Craig Ringer, Simon Riggs, KaiGai Kohei
2014-04-12 21:04:58 -04:00
Tom Lane 9d229f399e Provide moving-aggregate support for a bunch of numerical aggregates.
First installment of the promised moving-aggregate support in built-in
aggregates: count(), sum(), avg(), stddev() and variance() for
assorted datatypes, though not for float4/float8.

In passing, remove a 2001-vintage kluge in interval_accum(): interval
array elements have been properly aligned since around 2003, but
nobody remembered to take out this workaround.  Also, fix a thinko
in the opr_sanity tests for moving-aggregate catalog entries.

David Rowley and Florian Pflug, reviewed by Dean Rasheed
2014-04-12 20:33:09 -04:00
Tom Lane a9d9acbf21 Create infrastructure for moving-aggregate optimization.
Until now, when executing an aggregate function as a window function
within a window with moving frame start (that is, any frame start mode
except UNBOUNDED PRECEDING), we had to recalculate the aggregate from
scratch each time the frame head moved.  This patch allows an aggregate
definition to include an alternate "moving aggregate" implementation
that includes an inverse transition function for removing rows from
the aggregate's running state.  As long as this can be done successfully,
runtime is proportional to the total number of input rows, rather than
to the number of input rows times the average frame length.

This commit includes the core infrastructure, documentation, and regression
tests using user-defined aggregates.  Follow-on commits will update some
of the built-in aggregates to use this feature.

David Rowley and Florian Pflug, reviewed by Dean Rasheed; additional
hacking by me
2014-04-12 12:03:30 -04:00
Heikki Linnakangas 614167c6d7 Fix bugs in GIN "fast scan" with partial match.
There were a couple of bugs here. First, if the fuzzy limit was exceeded,
the loop in entryGetItem might drop out too soon if a whole block needs to
be skipped because it's < advancePast ("continue" in a while-loop checks the
loop condition too). Secondly, the loop checked when stepping to a new page
that there is at least one offset on the page < advancePast, but we cannot
rely on that on subsequent calls of entryGetItem, because advancePast might
change in between. That caused the skipping loop to read bogus items in the
TbmIterateResult's offset array.

First item and fix by Alexander Korotkov, second bug pointed out by Fabrízio
de Royes Mello, by a small variation of Alexander's test query.
2014-04-10 23:42:04 +03:00
Bruce Momjian 8fcccadfea C comment: track_activity_query_size doesn't support memory units
And explain why.

Per report from Pavel Stehule
2014-04-10 09:57:04 -04:00
Heikki Linnakangas 787064cd00 Fix typo in comment.
Tomonari Katsumata
2014-04-10 13:11:49 +03:00
Heikki Linnakangas 150a9df528 Fix a few more misc typos in comments. 2014-04-10 00:53:55 +03:00
Heikki Linnakangas 5b075ae893 Fix misc typos in comments. 2014-04-09 23:16:35 +03:00
Robert Haas b082732061 Add missing include.
This is more cleanup from commit 11a65eed16.

Amit Kapila
2014-04-09 11:46:49 -04:00
Robert Haas 0c4ea7a309 Fix silly oversight in patch to remove dsm state file.
I'm not sure if this is what's causing the Windows buildfarm members
to get unhappy, but I don't think it can be helping anything...
2014-04-08 16:22:50 -04:00
Tom Lane f23a5630eb Add an in-core GiST index opclass for inet/cidr types.
This operator class can accelerate subnet/supernet tests as well as
btree-equivalent ordered comparisons.  It also handles a new network
operator inet && inet (overlaps, a/k/a "is supernet or subnet of"),
which is expected to be useful in exclusion constraints.

Ideally this opclass would be the default for GiST with inet/cidr data,
but we can't mark it that way until we figure out how to do a more or
less graceful transition from the current situation, in which the
really-completely-bogus inet/cidr opclasses in contrib/btree_gist are
marked as default.  Having the opclass in core and not default is better
than not having it at all, though.

While at it, add new documentation sections to allow us to officially
document GiST/GIN/SP-GiST opclasses, something there was never a clear
place to do before.  I filled these in with some simple tables listing
the existing opclasses and the operators they support, but there's
certainly scope to put more information there.

Emre Hasegeli, reviewed by Andreas Karlsson, further hacking by me
2014-04-08 15:46:43 -04:00
Robert Haas 11a65eed16 Get rid of the dynamic shared memory state file.
Instead of storing the ID of the dynamic shared memory control
segment in a file within the data directory, store it in the main
control segment.  This avoids a number of nasty corner cases,
most seriously that doing an online backup and then using it on
the same machine (e.g. to fire up a standby) would result in the
standby clobbering all of the master's dynamic shared memory
segments.

Per complaints from Heikki Linnakangas, Fujii Masao, and Tom
Lane.
2014-04-08 11:39:55 -04:00
Robert Haas 0886fc6a5c Add new to_reg* functions for error-free OID lookups.
These functions won't throw an error if the object doesn't exist,
or if (for functions and operators) there's more than one matching
object.

Yugo Nagata and Nozomi Anzai, reviewed by Amit Khandekar, Marti
Raudsepp, Amit Kapila, and me.
2014-04-08 10:27:56 -04:00
Heikki Linnakangas 7ca32e255b Fix hot standby bug with GiST scans.
Don't reset the rightlink of a page when replaying a page update record.
This was a leftover from pre-hot standby days, when it was not possible to
have scans concurrent with WAL replay. Resetting the right-link was not
necessary back then either, but it was done for the sake of tidiness. But
with hot standby, it's wrong, because a concurrent scan might still need it.

Backpatch all versions with hot standby, 9.0 and above.
2014-04-08 14:51:40 +03:00
Heikki Linnakangas 38a2b95c34 Zero padding byte at end of GIN posting list.
This isn't strictly necessary, but helps debugging.
2014-04-07 19:49:03 +03:00
Robert Haas f235db03ff Remove 'make clean' support for ipc_test.
I missed this in the previous commit; Tom Lane spotted my error.
2014-04-07 11:45:27 -04:00
Robert Haas 315772e4ec Assert that strong-lock count is >0 everywhere it's decremented.
The one existing assertion of this type has tripped a few times in the
buildfarm lately, but it's not clear whether the problem is really
originating there or whether it's leftovers from a trip through one
of the other two paths that lack a matching assertion.  So add one.

Since the same bug(s) most likely exist(s) in the back-branches also,
back-patch to 9.2, where the fast-path lock mechanism was added.
2014-04-07 10:59:42 -04:00