so that user-defined window functions are possible. For the moment you'll
have to write them in C, for lack of any interface to the WindowObject API
in the available PLs, but it's better than no support at all.
There was some debate about the best syntax for this. I ended up choosing
the "it's an attribute" position --- the other approach will inevitably be
more work, and the likely market for user-defined window functions is
probably too small to justify it.
patch. This includes the ability to force the frame to cover the whole
partition, and the ability to make the frame end exactly on the current row
rather than its last ORDER BY peer. Supporting any more of the full SQL
frame-clause syntax will require nontrivial hacking on the window aggregate
code, so it'll have to wait for 8.5 or beyond.
field needs to be included in equalRuleLocks() comparisons, else updates
will fail to propagate into relcache entries when they have positive
reference count (ie someone is using the relcache entry).
Per report from Alex Hunsaker.
upcoming window-functions patch. First, tuplestore_trim is now an
exported function that must be explicitly invoked by callers at
appropriate times, rather than something that tuplestore tries to do
behind the scenes. Second, a read pointer that is marked as allowing
backward scan no longer prevents truncation. This means that a read pointer
marked as having BACKWARD but not REWIND capability can only safely read
backwards as far as the oldest other read pointer. (The expected use pattern
for this involves having another read pointer that serves as the truncation
fencepost.)
This doesn't do any remote or external things yet, but it gives modules
like plproxy and dblink a standardized and future-proof system for
managing their connection information.
Martin Pihlak and Peter Eisentraut
explicit cast to show the intended array type, we forgot to teach ruleutils.c
to print out such constructs properly. Found by noting bogus output from
recent changes in polymorphism regression test.
materialize-mode set results. Since it now uses the ReturnSetInfo node
to hold internal state, we need to be sure to set up the node even when
the immediately called function doesn't return set (but does have a set-valued
argument). Per report from Anupama Aherrao.
skipped. We could update relpages anyway, but it seems better to only
update it together with reltuples, because we use the reltuples/relpages
ratio in the planner. Also don't update n_live_tuples in pgstat.
ANALYZE in VACUUM ANALYZE now needs to update pg_class, if the
VACUUM-phase didn't do so. Added some boolean-passing to let analyze_rel
know if it should update pg_class or not.
I also moved the relcache invalidation (to update rd_targblock) from
vac_update_relstats to where RelationTruncate is called, because
vac_update_relstats is not called for partial vacuums anymore. It's more
obvious to send the invalidation close to the truncation that requires it.
Per report by Ned T. Crigler.
includes a few new ones.
- Fixed compilation errors on OS X for probes that use typedefs
- Fixed a number of probes to pass ForkNumber per the relation forks
patch
- The new probes are those that were taken out from the previous
submitted patch and required simple fixes. Will submit the other probes
that may require more discussion in a separate patch.
Robert Lor
the other major heapam.c functions. The only known consequence of this
omission is that UPDATE RETURNING failed to return the correct value for
"tableoid", as per report from KaiGai Kohei.
Back-patch to 8.2. Arguably it's wrong all the way back; but without
evidence of visible breakage before RETURNING was added, I'll desist from
patching the older branches.
to return NULL, instead of erroring out, if the target object is specified by
OID and we can't find that OID in the catalogs. Since these functions operate
internally on SnapshotNow rules, there is a race condition when using them
in user queries: the query's MVCC snapshot might "see" a catalog row that's
already committed dead, leading to a failure when the inquiry function is
applied. Returning NULL should generally provide more convenient behavior.
This issue has been complained of before, and in particular we are now seeing
it in the regression tests due to another recent patch.
to 10, to compensate for the recent change in default statistics target.
The original number was pulled out of the air anyway :-(, but it was picked
in the context of the old default, so holding the default size of the
MCELEM array constant seems the best thing. Per discussion.
pg_database_encoding_max_length() predicts the maximum character length
returned by wchar2char(). Per Hiroshi Inoue, MB_CUR_MAX isn't usable on
Windows because we allow encoding = UTF8 when the locale says differently;
and getting rid of it seems a good idea on general principles because it
narrows our dependence on libc's locale API just a little bit more.
Also install a check for overflow of the buffer size computation.
actual argument type of ANYARRAY to match an argument declared ANYARRAY,
so long as ANYELEMENT etc aren't used. I had overlooked the fact that this
is a possible case while fixing bug #3852; but it is possible because
pg_statistic contains columns declared ANYARRAY. Per gripe from Corey Horton.
when they are invoked by the parser. We had been setting up a snapshot at
plan time but really it needs to be done earlier, before parse analysis.
Per report from Dmitry Koterov.
Also fix two related problems discovered while poking at this one:
exec_bind_message called datatype input functions without establishing a
snapshot, and SET CONSTRAINTS IMMEDIATE could call trigger functions without
establishing a snapshot.
Backpatch to 8.2. The underlying problem goes much further back, but it is
masked in 8.1 and before because we didn't attempt to invoke domain check
constraints within datatype input. It would only be exposed if a C-language
datatype input function used the snapshot; which evidently none do, or we'd
have heard complaints sooner. Since this code has changed a lot over time,
a back-patch is hardly risk-free, and so I'm disinclined to patch further
than absolutely necessary.
vacuuming (it's not), say "database-wide VACUUM" instead of "full-database
VACUUM" in the relevant hint messages. Also, document the permissions needed
to do this. Per today's discussion.
right child if it doesn't need to. This saves some miniscule number
of cycles, but the ulterior motive is to avoid an optimization bug
known to exist in SCO's C compiler (and perhaps others?)
replication patch needs a signal, but we've already used SIGUSR1 and
SIGUSR2 in normal backends. This patch allows reusing SIGUSR1 for that,
and for other purposes too if the need arises.
where no function stats entries exist. Partial response to Pavel's
observation that small VACUUM operations are noticeably slower in CVS HEAD
than 8.3.
form a join and that case doesn't have anything to join to. (We could
probably make it work if we didn't pull up the subquery, but it seems to
me that the case isn't worth extra code.) Per report from Greg Stark.
appendix on key words. catdesc was originally intended as computer-readable,
but since we ended up adding catcode, we can have more elaborate descriptions.
non-writable large objects need to have their snapshots registered on the
transaction resowner, not the current portal's, because it must persist until
the large object is closed (which the portal does not). Also, ensure that the
serializable snapshot is recorded by the transaction resource owner too, even
when a subtransaction has changed the current resource owner before
serializable is taken.
Per bug reports from Pavan Deolasee.
the visibility map patch that because autovacuum always sets
VacuumStmt->freeze_min_age, visibility map was never used for autovacuum,
only for manually launched vacuums. This patch introduces a new scan_all
field to VacuumStmt, indicating explicitly whether the visibility map
should be used, or the whole relation should be scanned, to advance
relfrozenxid. Anti-wraparound vacuums still need to scan all pages.
it's connection. This is required for applications that unload
the libpq library (such as PHP) in which case we'd otherwise
have pointers to these functions when they no longer exist.
This needs a bit more testing before we can consider a backpatch,
so not doing that yet.
In passing, remove unused functions in backend/libpq.
Bruce Momjian and Magnus Hagander, per report and analysis
by Russell Smith.
heap page, where a set bit indicates that all tuples on the page are
visible to all transactions, and the page therefore doesn't need
vacuuming. It is stored in a new relation fork.
Lazy vacuum uses the visibility map to skip pages that don't need
vacuuming. Vacuum is also responsible for setting the bits in the map.
In the future, this can hopefully be used to implement index-only-scans,
but we can't currently guarantee that the visibility map is always 100%
up-to-date.
In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
each heap page, also indicating that all tuples on the page are visible to
all transactions. It's important that this flag is kept up-to-date. It
is also used to skip visibility tests in sequential scans, which gives a
small performance gain on seqscans.
outer join clauses. Given, say,
... from a left join b on a.a1 = b.b1 where a.a1 = 42;
we'll deduce a clause b.b1 = 42 and then mark the original join clause
redundant (we can't remove it completely for reasons I don't feel like
squeezing into this log entry). However the original implementation of
that wasn't bulletproof, because clause_selectivity() wouldn't honor
this_selec if given nonzero varRelid --- which in practice meant that
it worked as desired *except* when considering index scan quals. Which
resulted in bogus underestimation of the size of the indexscan result for
an inner indexscan in an outer join, and consequently a possibly bad
choice of indexscan vs. bitmap scan. Fix by introducing an explicit test
into clause_selectivity(). Also, to make sure we don't trigger that test
in corner cases, change the convention to be that this_selec > 1, not
this_selec = 1, means it's been marked redundant. Per trouble report from
Scara Maccai.
Back-patch to 8.2, where the problem was introduced.
is treated like a non-digit separator. This fixes the inconsistency in
examples like:
to_timestamp('2008-01-2', 'YYYY-MM-DD') -- didn't work
and
to_timestamp('2008-1-02', 'YYYY-MM-DD') -- did work
toasted values, since those could get dropped once the cursor's transaction
is over. Per bug #4553 from Andrew Gierth.
Back-patch as far as 8.1. The bug actually exists back to 7.4 when holdable
cursors were introduced, but this patch won't work before 8.1 without
significant adjustments. Given the lack of field complaints, it doesn't seem
worth the work (and risk of introducing new bugs) to try to make a patch for
the older branches.
that a Portal is a useful and sufficient additional argument for
CreateDestReceiver --- it just isn't, in most cases. Instead formalize
the approach of passing any needed parameters to the receiver separately.
One unexpected benefit of this change is that we can declare typedef Portal
in a less surprising location.
This patch is just code rearrangement and doesn't change any functionality.
I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.
the basic representational details (typlen, typalign, typbyval, typstorage)
to be copied from an existing type rather than listed explicitly in the
CREATE TYPE command. The immediate reason for this is to provide a simple
solution for add-on modules that want to define types represented as int8,
float4, or float8: as of 8.4 the appropriate PASSEDBYVALUE setting is
platform-specific and so it's hard for a SQL script to know what to do.
This patch fixes the contrib/isn breakage reported by Rushabh Lathia.
This was a thinko introduced in a patch from last February; it results
in memory leakage if an SRF is shut down before the actual end of query,
because subsequent code will be running in a longer-lived context than
it's expecting to be.
RHS that can't be unique-ified --- join_is_legal has to check that before
deciding to build a join, else we'll have an unimplementable joinrel.
Per report from Greg Stark.
DestReceiver created during postquel_start needs to be destroyed during
postquel_end. In a moment of brain fade I had assumed this would be taken
care of by FreeQueryDesc, but it's not (and shouldn't be).
by hand. As an added bonus, the new code is smaller and more understandable,
and the ugly loops are gone.
This had been discussed all along but never implemented. It became clear that
it really needed to be fixed after a bug report by Pavan Deolasee.
the bgwriter immediately. This covers the case where the bgwriter is still
starting up, as seen in a recent buildfarm failure. In future it might also
assist with clean recovery after a bgwriter termination and restart ---
right now the postmaster treats early bgwriter exit as a system crash,
but that might not always be so.
though it is an inner rather than outer join type. This essentially means
that we don't bother to separate "pushed down" qual conditions from actual
join quals at a semijoin plan node; which is okay because the restrictions of
SQL syntax make it impossible to have a pushed-down qual that references the
inner side of a semijoin. This allows noticeably better optimization of
IN/EXISTS cases than we had before, since the equivalence-class machinery can
now use those quals. Also fix a couple of other mistakes that had essentially
disabled the ability to unique-ify the inner relation and then join it to just
a subset of the left-hand relations. An example case using the regression
database is
select * from tenk1 a, tenk1 b
where (a.unique1,b.unique2) in (select unique1,unique2 from tenk1 c);
which is planned reasonably well by 8.3 and earlier but had been forcing a
cartesian join of a/b in CVS HEAD.
as LIKE. I oversimplified this code when removing support for plan-time
determination of index operator lossiness back in April --- I had thought
create_bitmap_subplan could stop returning two separate lists of qual
conditions, but it still must so that we can treat special operators
correctly in create_bitmap_scan_plan. Per report from Rushabh Lathia.
truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To
make that cleaner from modularity point of view, move the WAL-logging one
level up to RelationTruncate, and move RelationTruncate and all the
related WAL-logging to new src/backend/catalog/storage.c file. Introduce
new RelationCreateStorage and RelationDropStorage functions that are used
instead of calling smgrcreate/smgrscheduleunlink directly. Move the
pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new
functions. This leaves smgr.c as a thin wrapper around md.c; all the
transactional stuff is now in storage.c.
This will make it easier to add new forks with similar truncation logic,
like the visibility map.
* Refactor explain.c slightly to export a convenient-to-use subroutine
for printing EXPLAIN results.
* Provide hooks for plugins to get control at ExecutorStart and ExecutorEnd
as well as ExecutorRun.
* Add some minimal support for tracking the total runtime of ExecutorRun.
This code won't actually do anything unless a plugin prods it to.
* Change the API of the DefineCustomXXXVariable functions to allow nonzero
"flags" to be specified for a custom GUC variable. While at it, also make
the "bootstrap" default value for custom GUCs be explicitly specified as a
parameter to these functions. This is to eliminate confusion over where the
default comes from, as has been expressed in the past by some users of the
custom-variable facility.
* Refactor GUC code a bit to ensure that a custom variable gets initialized to
something valid (like its default value) even if the placeholder value was
invalid.
locate the target row, if the cursor was declared with FOR UPDATE or FOR
SHARE. This approach is more flexible and reliable than digging through the
plan tree; for instance it can cope with join cursors. But we still provide
the old code for use with non-FOR-UPDATE cursors. Per gripe from Robert Haas.
return the tableoid as well as the ctid for any FOR UPDATE targets that
have child tables. All child tables are listed in the ExecRowMark list,
but the executor just skips the ones that didn't produce the current row.
Curiously, this longstanding restriction doesn't seem to have been documented
anywhere; so no doc changes.
if the user is superuser. This makes available to extension modules the same
sort of trick being practiced by array_agg(). The reason for the superuser
restriction is that you could crash the system by connecting up an
incompatible pair of internal-using functions as an aggregate. It shouldn't
interfere with any legitimate use, since you'd have to be superuser to create
the internal-using transition and final functions anyway.
returns VOID. This is the last of the easy fixes I recommended in
11870.1218838360@sss.pgh.pa.us --- the others got done awhile ago but
I forgot about this one.
heap_form_tuple. Since this removes the last remaining caller of
heap_addheader, remove it.
Extracted from the column privileges patch from Stephen Frost, with further
code cleanups by me.
anyelement. This lacks the WITH ORDINALITY option, as well as the multiple
input arrays option added in the most recent SQL specs. But it's still a
pretty useful subset of the spec's functionality, and it is enough to
allow obsoleting contrib/intagg.
for inserting tuples in increasing TID order. It's not clear whether this
fully explains Ivan Sergio Borgonovo's complaint, but simple testing
confirms that a scan that doesn't start at block 0 can slow GIN build by
a factor of three or four.
Backpatch to 8.3. Sync scan didn't exist before that.
- FloatOnly: only used by NumericOnly, instead put the FloatOnly production into NumericOnly
- IntegerOnly: only used by NumericOnly and one ALTER TABLE rule, replacement SignedIconst is already used in several other places
operator. The result depends only on the two input operators and the proof
direction (imply or refute), so it's easy to cache. This provides a very
large savings in cases such as Sergey Konoplev's long NOT-IN-list example,
where predtest spends all its time repeatedly figuring out that the same pair
of operators cannot be used to prove anything. (But of course the O(N^2)
behavior still catches up with you eventually.) I'm not convinced it buys
a whole lot when constraint_exclusion isn't turned on, but it's not a lot
of added code so we might as well cache all the time.
AND, OR, or equivalent clauses: if there are too many (more than 100) just
exit without proving anything. This ensures that we don't spend O(N^2) time
trying (and most likely failing) to prove anything about very long IN lists
and similar cases.
Also, install a couple of CHECK_FOR_INTERRUPTS calls to ensure that a long
proof attempt can be interrupted.
Per gripe from Sergey Konoplev.
Back-patch the whole patch to 8.2 and just the CHECK_FOR_INTERRUPTS addition
to 8.1. (The rest of the patch doesn't apply cleanly, and since 8.1 doesn't
show the complained-of behavior anyway, it doesn't seem necessary to work
hard on it.)
function as a special case.
This version still has the suspicious behavior of returning null for an
empty array (rather than zero), but this may need a wholesale revision of
empty array behavior, currently under discussion.
Jim Nasby, Robert Haas, Peter Eisentraut
in "postgres_verbose" intervalstyle, and the equally arbitrary decision to
show at least two fractional-seconds digits in most other datetime display
styles. This results in some minor changes in the expected regression test
outputs.
Also, coalesce a lot of repetitive code in datetime.c into subroutines,
for clarity and ease of maintenance. In particular this roughly halves
the number of #ifdef HAVE_INT64_TIMESTAMP segments.
Ron Mayer, with some additional kibitzing from Tom Lane
translated_vars list get updated when pulling up an appendrel member. It's
not clear that this really matters at present, since relatively little gets
done with the outputs of an appendrel child relation; but it probably will
come back to bite us sometime if we leave them with the wrong values.
we extended the appendrel mechanism to support UNION ALL optimization. The
reason nobody noticed was that we are not actually using attr_needed data for
appendrel children; hence it seems more reasonable to rip it out than fix it.
Back-patch to 8.2 because an Assert failure is possible in corner cases.
Per examination of an example from Jim Nasby.
In HEAD, also get rid of AppendRelInfo.col_mappings, which is quite inadequate
to represent UNION ALL situations; depend entirely on translated_vars instead.
"base/11517/3767_fsm", instead of symbolic names like "1663/11517/3767/1",
per Alvaro's suggestion. I didn't change the messages in the higher-level
index, heap and FSM routines, though, where the fork is implicit.
specifically, we can input either the "format with designators" or the
"alternative format", and we can output the former when IntervalStyle is set
to iso_8601.
Ron Mayer
the length of a UTF8 character with pg_mblen (wrong if DB encoding isn't
UTF8), and the latter was blithely assuming that a static buffer would somehow
revert to all zeroes for each use.
("there might be triggers") rather than an exact count. This is necessary
catalog infrastructure for the upcoming patch to reduce the strength of
locking needed for trigger addition/removal. Split out and committed
separately for ease of reviewing/testing.
In passing, also get rid of the unused pg_class columns relukeys, relfkeys,
and relrefs, which haven't been maintained in many years and now have no
chance of ever being maintained (because of wishing to avoid locking).
Simon Riggs
from DateStyle, and create a new interval style that produces output matching
the SQL standard (at least for interval values that fall within the standard's
restrictions). IntervalStyle is also used to resolve the conflict between the
standard and traditional Postgres rules for interpreting negative interval
input.
Ron Mayer
(but not locked, as that would risk deadlocks). Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.
Robert Haas, after an idea of Simon Riggs'.
data type. This patch takes the approach of allowing an optional hyphen after
each group of four hex digits.
Author: Robert Haas <robertmhaas@gmail.com>
from COMMITTED to SUBCOMMITTED during recovery. This wasn't previously
possible, but it is now due to the recent changes on clog commit protocol for
subtransactions.
Simon Riggs
upon requests from backends, rather than on a fixed 500msec cycle. (There's
still throttling logic to ensure it writes no more often than once per
500msec, though.) This should result in a significant reduction in stats file
write traffic in typical scenarios where the stats are demanded only
infrequently.
This approach also means that the former difficulty with changing
stats_temp_directory on-the-fly has gone away, so remove the caution about
that as well as the thrashing we did to minimize the trouble window.
In passing, also fix pgstat_report_stat() so that we will send a stats
message if we have function call stats but not table stats to report;
this fixes a bug in the recent patch to support function-call stats.
Martin Pihlak
allowed different processes to have different addresses for the shmem segment
in quite a long time, but there were still a few places left that used the
old coding convention. Clean them up to reduce confusion and improve the
compiler's ability to detect pointer type mismatches.
Kris Jurka
and heap_deformtuple in favor of the newer functions heap_form_tuple et al
(which do the same things but use bool control flags instead of arbitrary
char values). Eliminate the former duplicate coding of these functions,
reducing the deprecated functions to mere wrappers around the newer ones.
We can't get rid of them entirely because add-on modules probably still
contain many instances of the old coding style.
Kris Jurka
it just return void instead of sometimes returning a TupleTableSlot. SQL
functions don't need that anymore, and noplace else does either. Eliminating
the return value also means one less hassle for the ExecutorRun hook functions
that will be supported beginning in 8.4.
on non-full-page-image WAL records, and quite arbitrarily, only if there's
less than 20% free space on the page after the insert/update (not on HOT
updates, though). The 20% cutoff should avoid most of the overhead, when
replaying a bulk insertion, for example, while ensuring that pages that
are full are marked as full in the FSM.
This is mostly to avoid the nasty worst case scenario, where you replay
from a PITR archive, and the FSM information in the base backup is really
out of date. If there was a lot of pages that the outdated FSM claims to
have free space, but don't actually have any, the first unlucky inserter
after the recovery would traverse through all those pages, just to find
out that they're full. We didn't have this problem with the old FSM
implementation, because we simply threw the FSM information away on a
non-clean shutdown.
RETURNING clause, not just a SELECT as formerly.
A side effect of this patch is that when a set-returning SQL function is used
in a FROM clause, performance is improved because the output is collected into
a tuplestore within the function, rather than using the less efficient
value-per-call mechanism.
functions into one ReadBufferExtended function, that takes the strategy
and mode as argument. There's three modes, RBM_NORMAL which is the default
used by plain ReadBuffer(), RBM_ZERO, which replaces ZeroOrReadBuffer, and
a new mode RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages
without throwing an error. The FSM needs the new mode to recover from
corrupt pages, which could happend if we crash after extending an FSM file,
and the new page is "torn".
Add fork number to some error messages in bufmgr.c, that still lacked it.
in the Global\ namespace, because it caused permission errors on
a lot of platforms.
We need to come up with something better for 8.4, but for now
revert to the pre-8.3.4 behaviour.