cache lookup in the success case. This won't help much for cases where
the given relation is far down the search path, but it does not hurt in
any cases either; and it requires only a little new code. Per gripe from
Jim Nasby about slowness of \d with many tables.
the parameter's name (if any) as the default column name for SELECT FROM
the function, rather than the function name as previously. I still think
this is a bad idea, but I lost the argument. Force decompilation of
function RTEs to specify full aliases always, to reduce the odds of this
decision breaking dumped views.
predicate_implied_by() to detect redundant filter conditions, but forgot
that predicate_implied_by() assumes its first argument contains only
immutable functions. Add a check to guarantee that. Also, test to see
if filter conditions can be discarded because they are redundant with
the predicate of a partial index.
generated by bitmap index scans. Along the way, simplify and speed up
the code for counting sequential and index scans; it was both confusing
and inefficient to be taking care of that in the per-tuple loops, IMHO.
initdb forced because of internal changes in pg_stat view definitions.
was created on a machine with alignment rules and floating-point format
similar to the current machine. Per recent discussion, this seems like
a good idea with the increasing prevalence of 32/64 bit environments.
argument as a 'regclass' value instead of a text string. The frontend
conversion of text string to pg_class OID is now encapsulated as an
implicitly-invocable coercion from text to regclass. This provides
backwards compatibility to the old behavior when the sequence argument
is explicitly typed as 'text'. When the argument is just an unadorned
literal string, it will be taken as 'regclass', which means that the
stored representation will be an OID. This solves longstanding problems
with renaming sequences that are referenced in default expressions, as
well as new-in-8.1 problems with renaming such sequences' schemas or
moving them to another schema. All per recent discussion.
Along the way, fix some rather serious problems in dbmirror's support
for mirroring sequence operations (int4 vs int8 confusion for instance).
the ProcessUtility case, resulting in an intratransaction memory leak
if a utility command actually did return any tuples, as reported by
Dmitry Karasik. Fix this and also make the behavior more consistent
for cases involving nested SPI operations and multiple query trees,
by ensuring that we store the state locally until it is ready to be
returned to the caller.
only the inner-side relation would be considered as potential equijoin clauses,
which is wrong because the condition doesn't necessarily hold above the point
of the outer join. Per test case from Kevin Grittner (bug#1916).
outer relation is empty did not work, per test case from Patrick Welche.
It tried to use nodeHashjoin.c's high-level mechanisms for fetching an
outer-relation tuple, but that code expected the hash table to be filled
already. As patched, the code failed in corner cases such as having no
outer-relation tuples for the first hash batch. Revert and rewrite.
for int8 and related types. However we might be talking to a client
that has working int64; so pq_getmsgint64 really needs to check the
incoming value and throw an overflow error if we can't represent it
accurately.
"optimization". When we find a potentially useful joinclause, we
have to add all its other required_relids to the result, not only the
other clause_relids. They are different in the case of a joinclause
whose applicability has to be postponed due to outer join. We have
to include the extra rels because otherwise, after best_inner_indexscan
masks the join rels with index_outer_relids, it will always fail to
find the joinclause as applicable. Per report from Husam Tomeh.
in which invalid page data could be transiently written to disk by
concurrent bgwriter activity. There doesn't seem any risk of loss of
actual user data, but an empty page could possibly be left corrupt if a
crash occurs before the correct data gets written out. Pointed out by
Alvaro Herrera.
strings. This is consistent with SQL conventions, and since Bruce
already changed initdb in a way that assumed it worked like this, seems
we'd better make it work like this.
sake of brevity and clarity.
Make pg_reload_conf(), pg_rotate_logfile(), and pg_cancel_backend()
return a boolean rather than an integer to indicate success or failure.
Along the way, make some minor cleanups to dbsize.c -- in particular,
use elog() rather than ereport() for "shouldn't happen" error
conditions, and remove some of the more flagrant violations of the
Postgres indentation conventions.
Catalog version bumped.
bytes. This shouldn't make any difference on x86 machines, where the size
happened to be 16 bytes anyway, but on 64-bit machines and machines with
slock_t int or wider, it will speed array indexing and hopefully reduce
SMP cache contention effects. Per recent experimentation.
recovered. I did not see any actual leak while testing this in CVS tip,
but 8.0 definitely has a problem with leaking the space temporarily
palloc'd by BufferSync(). In any case this seems a good idea to forestall
similar problems in future. Per report from Arjen van der Meijden.
to drop connections unceremoniously. Also some other marginal cleanups:
don't query getsockopt() repeatedly if it fails, and avoid having the
apparent definition of struct Port depend on which system headers you
might have included or not. Oliver Jowett and Tom Lane.
in the zic database or zone names found in the date token table. This
preserves the old ability to do AT TIME ZONE 'PST' along with the new
ability to do AT TIME ZONE 'PST8PDT'. Per gripe from Bricklen Anderson.
Also, fix some inconsistencies in usage of TZ_STRLEN_MAX --- the old
code had the potential for one-byte buffer overruns, though given
alignment considerations it's unlikely there was any real risk.
for procedural languages. This replaces the hard-wired table I had
originally proposed as a stopgap solution. For the moment, the initial
contents only include languages shipped with the core distribution.
as per my recent proposal. For now the template data is hard-wired in
proclang.c --- this should be replaced later by a new shared system
catalog, but we don't want to force initdb during 8.1 beta. This change
lets us cleanly load existing dump files even if they contain outright
wrong information about a PL's support functions, such as a wrong path
to the shared library or a missing validator function. Also, we can
revert the recent kluges to make pg_dump dump PL support functions that
are stored in pg_catalog.
While at it, I removed the code in pg_regress that replaced $libdir
with a hardcoded path for temporary installations. This is no longer
needed given our support for relocatable installations.
when there are extra resjunk columns in the child node. I found some
additional cases involving Append nodes that weren't handled by the
prior patch, and it's not clear how to fix them in the same way without
breaking inheritance cases. So the prudent path seems to be to narrow
the scope of the optimization.
has to recopy the input plan node's targetlist if it removes a
SubqueryScan node just below the non-projecting node. For simplicity
I made it recopy always. Per bug report from Allan Wang and Michael Fuhr.
on a page, as suggested by ITAGAKI Takahiro. Also, change a few places
that were using some other estimates of max-items-per-page to consistently
use MaxOffsetNumber. This is conservatively large --- we could have used
the new MaxHeapTuplesPerPage macro, or a similar one for index tuples ---
but those places are simply declaring a fixed-size buffer and assuming it
will work, rather than actively testing for overrun. It seems safer to
size these buffers in a way that can't overflow even if the page is
corrupt.
saves nearly 700kB in the default shared memory segment size, which seems
worthwhile, and it is a feature that many users won't use anyway. Per
Heikki's argument, there is no point in a compromise value --- those who
are using 2PC at all will probably want it at least equal to max_connections.
But we can't set it to zero by default without breaking the prepared_xacts
regression test.
so that the latter estimates the number of groups that grouping will
produce. This is needed because it is primarily query_planner that
makes the decision between fast-start and fast-finish plans, and in the
original coding it was unable to make more than a crude rule-of-thumb
choice when the query involved grouping. This revision helps us make
saner choices for queries like SELECT ... GROUP BY ... LIMIT, as in a
recent example from Mark Kirkwood. Also move the responsibility for
canonicalizing sort_pathkeys and group_pathkeys into query_planner;
this information has to be available anyway to support the first change,
and doing it this way lets us get rid of compare_noncanonical_pathkeys
entirely.
to copy the whole plan tree before invoking adjust_plan_varnos(); else
if there is any multiply-linked substructure, the latter might increment
some Var's varno twice. Previously there were some retail copyObject
calls inside adjust_plan_varnos, but it seems a lot safer to just dup the
whole tree first. Also, set_inner_join_references was trying to avoid
work by not recursing if a BitmapHeapScan's bitmapqualorig contained no
outer references; which was OK at the time the code was written, I think,
but now that create_bitmap_scan_plan removes duplicate clauses from
bitmapqualorig it is possible for that field to be NULL while outer
references still remain in the qpqual and/or contained indexscan nodes.
For safety, always recurse even if the BitmapHeapScan looks to be outer
reference free. Per reports from Michael Fuhr and Oleg Bartunov.
the parent table, even if the command that creates them is executed by
someone else (such as a superuser or a member of the owning role).
Per gripe from Michael Fuhr.
in interval_mul and interval_div. This avoids an optimization bug
in A Certain Company's compiler (and given their explanation, I wouldn't
be surprised if other compilers blow it too). Besides the code seems
more clear this way --- in the original formulation, you had to mentally
recognize the common subexpression in order to understand what was going
on.
use these instead of its previous hack of changing pg_class.reltriggers.
Documentation is lacking, will add that later.
Patch by Satoshi Nagayasu, review and some extra work by Tom Lane.
after the fact. Fix bug with incorrect test for whether we are at end
of logfile segment. Arrange for writes triggered by XLogInsert's
is-cache-more-than-half-full test to synchronize with the cache boundaries,
so that in long transactions we tend to write alternating halves of the
cache rather than randomly chosen portions of it; this saves one more
write syscall per cache load.
erroring out as it has done for the last couple weeks. Document that this
form is now ignored because indexes can't usefully have different owners
from their parent tables. Fix pg_dump to not generate ALTER OWNER commands
for indexes.
discussion of getting around this by relaxing the checks made for regular
users, but I'm disinclined to toy with the security model right now,
so just special-case it for superusers where needed.
not accepting queries).
errmsg("database is not accepting queries to avoid
wraparound data loss in database \"%s\"",
errhint("Stop the postmaster and use a standalone
backend to VACUUM database \"%s\".",
in postgresql.conf.sample, mark custom_variable_classes as SIGHUP not
POSTMASTER to agree with the documentation (I can't see a reason it has
to be POSTMASTER so I think the docs are right).
to 'Size' (that is, size_t), and install overflow detection checks in it.
This allows us to remove the former arbitrary restrictions on NBuffers
etc. It won't make any difference in a 32-bit machine, but in a 64-bit
machine you could theoretically have terabytes of shared buffers.
(How efficiently we could manage 'em remains to be seen.) Similarly,
num_temp_buffers, work_mem, and maintenance_work_mem can be set above
2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional
work by moi.
insufficient paranoia in code that follows t_ctid links. (We must do both
because even with VACUUM doing it properly, the intermediate state with
a dangling t_ctid link is visible concurrently during lazy VACUUM, and
could be seen afterwards if either type of VACUUM crashes partway through.)
Also try to improve documentation about what's going on. Patch is a bit
bulky because passing the XMAX information around required changing the
APIs of some low-level heapam.c routines, but it's not conceptually very
complicated. Per trouble report from Teodor and subsequent analysis.
This needs to be back-patched, but I'll do that after 8.1 beta is out.
or OFFSET clauses by using estimate_expression_value(). The main advantage
of this is that if the expression is a Param and we have a value for the
Param, we'll use that value rather than defaulting. Also, fix some
thinkos in the logic for combining LIMIT/OFFSET with an externally
supplied tuple fraction (this covers cases like EXISTS(...LIMIT...)).
And make sure the results of all this are shown by EXPLAIN. Per a
gripe from Merlin Moncure.
Fix to_char(interval) to return large year/month/day/hour values that
are larger than possible timestamp values.
Prevent to_char(interval) format specifications that make no sense, like
Month.
Clean up formatting.c code to more logically handle return lengths.
their OID parameter. It was possible to crash the backend with
select array_in('{123}',0,0); because that would bypass the needed step
of initializing the workspace. These seem to be the only two places
with a problem, though (record_in and record_recv don't have the issue,
and the other array functions aren't depending on user-supplied input).
Back-patch as far as 7.4; 7.3 does not have the bug.
(the stats system has always collected this info, but the views were
filtering it out). Modify autovacuum so that over-threshold activity
in a toast table can trigger a VACUUM of the parent table, even if the
parent didn't appear to need vacuuming itself. Per discussion a month
or so back about "short, wide tables".
SearchCatCacheList and ReleaseCatCacheList. Previously, we incremented
and decremented the refcounts of list member tuples along with the list
itself, but that's unnecessary, and very expensive when the list is big.
It's cheaper to change only the list refcount. When we are considering
deleting a cache entry, we have to check not only its own refcount but
its parent list's ... but it's easy to arrange the code so that this
check is not made in any commonly-used paths, so the cost is really nil.
The bigger gain though is to refrain from DLMoveToFront'ing each individual
member tuple each time the list is referenced. To keep some semblance
of fair space management, lists are just marked as used or not since the
last cache cleanout search, and we do a MoveToFront pass only when about
to run a cleanout. In combination, these changes reduce the costs of
SearchCatCacheList and ReleaseCatCacheList from about 4.5% of pgbench
runtime to under 1%, according to my gprof results.
remember the output parameter set for himself. It's a bit of a kluge
but fixing array_in to work in bootstrap mode looks worse.
I removed the separate pg_file_length() function, as it no longer has any
real notational advantage --- you can write (pg_stat_file(...)).length.
stderr-in-service or output-from-syslogger-in-service code. Previously
everything was flagged as ERRORs there, which caused all instances to
log "LOG: logger shutting down" as error...
Please apply for 8.1. I'd also like it considered for 8.0 since logging
non-errors as errors can be cause for alarm amongst people who actually
look at their logs...
Magnus Hagander
> >> 3) I restarted the postmaster both times. I got this error
> both times.
> >> :25: ERROR: could not load library "C:/Program
> >> Files/PostgreSQL/8.0/lib/testtrigfuncs.dll": dynamic load error
>
> > Yes. We really need to look at fixing that error message. I had
> > forgotten it completely :-(
>
> > Bruce, you think we can sneak that in after feature freeze? I would
> > call it a bugfix :-)
>
> Me too. That's been on the radar for awhile --- please do
> send in a patch.
Here we go, that wasn't too hard :-)
Apart from adding the error handling, it does one more thing: it changes
the errormode when loading the DLLs. Previously if a DLL was broken, or
referenced other DLLs that couldn't be found, a popup dialog box would
appear on the screen. Which had to be clicked before the backend could
continue. This patch also disables the popup error message for DLL
loads.
I think this is something we should consider doing for the entire
backend - disable those popups, and say we deal with it ourselves. What
do you other win32 hackers thinnk about this?
In the meantime, this patch fixes the error msgs. Please apply for 8.1
and please consider a backpatch to 8.0.
Magnus Hagander
> > I ran across this yesterday on HEAD:
>
> > template1=# grant select on foo, foo to swm;
> > ERROR: tuple already updated by self
>
> Seems to fail similarly in every version back to 7.2; probably further,
> but that's all I have running at the moment.
>
> > We could do away with the error by producing a unique list of object names
> > -- but that would impose an extra cost on the common case.
>
> CommandCounterIncrement in the GRANT loop would be easier, likely.
> I'm having a hard time getting excited about it though...
Yeah, its not that exciting but that error message would throw your
average user.
I've attached a patch which calls CommandCounterIncrement() in each of the
grant loops.
Gavin Sherry
should surely be timestamptz not timestamp; fix some but not all of the
holes in check_and_make_absolute(); other minor cleanup. Also put in
the missed catversion bump.
computation. On modern machines this is as fast if not faster, and we
don't have to clog the CPU's L2 cache with a tens-of-KB pointer array.
If we ever decide to adopt a more dynamic allocation method for shared
buffers, we'll probably have to revert this patch, but in the meantime
we might as well save a few bytes and nanoseconds. Per Qingqing Zhou.
whenever we generate a new OID. This prevents occasional duplicate-OID
errors that can otherwise occur once the OID counter has wrapped around.
Duplicate relfilenode values are also checked for when creating new
physical files. Per my recent proposal.
delay and limit, both as global GUCs and as table-specific entries in
pg_autovacuum. stats_reset_on_server_start is now OFF by default,
but a reset is forced if we did WAL replay. XID-wrap vacuums do not
ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera
against the PGPROC array. Anything in the file that isn't in PGPROC
gets rejected as being a stale entry. This should solve complaints about
stale entries in pg_stat_activity after a BETERM message has been dropped
due to overload.
ResourceOwner mechanism already released all reference counts for the
cache entries; therefore, we do not need to scan the catcache or relcache
at transaction end, unless we want to do it as a debugging crosscheck.
Do the crosscheck only in Assert mode. This is the same logic we had
previously installed in AtEOXact_Buffers to avoid overhead with large
numbers of shared buffers. I thought it'd be a good idea to do it here
too, in view of Kari Lavikka's recent report showing a real-world case
where AtEOXact_CatCache is taking a significant fraction of runtime.
exit, instead of trying to take shortcuts. Introduce some additional
shutdown callback routines to eliminate kluges like having ProcKill
be responsible for shutting down the buffer manager. Ensure that the
order of operations during shutdown is predictable and what you would
expect given the module layering.
max_files_per_process. Going further than that is just a waste of
cycles, and it seems that current Cygwin does not cope gracefully
with deliberately running the system out of FDs. Per Andrew Dunstan.
character, tighten the inner loops of CopyReadLine and CopyReadAttribute,
arrange to parse out all the attributes of a line in just one call instead
of one CopyReadAttribute call per attribute, be smarter about which client
encodings require slow pg_encoding_mblen() loops. Also, clean up the
mishmash of static variables and overly-long parameter lists in favor of
passing around a single CopyState struct containing all the state data.
Original patch by Alon Goldshuv, reworked by Tom Lane.
This was not especially critical before, but it is now that we track
ownership dependencies --- the dependency for the rowtype *must* shift
to the new owner. Spotted by Bernd Helmle.
Also fix a problem introduced by recent change to allow non-superusers
to do ALTER OWNER in some cases: if the table had a toast table, ALTER
OWNER failed *even for superusers*, because the test being applied would
conclude that the new would-be owner had no create rights on pg_toast.
A side-effect of the fix is to disallow changing the ownership of indexes
or toast tables separately from their parent table, which seems a good
idea on the whole.
doesn't block the bgwriter from making progress writing out other buffers.
This was a hard problem in the context of the ARC/2Q design, but it's
trivial in the context of clock sweep ... just advance the sweep counter
before we try to write not after.
of special case for Windows port. Put a PG_TRY around most of createdb()
to ensure that we remove copied subdirectories on failure, even if the
failure happens while creating the pg_database row. (I think this explains
Oliver Siegmar's recent report.) Having done that, there's no need for
the fragile assumption that copydir() mustn't ereport(ERROR), so simplify
its API. Eliminate the old code that used system("cp ...") to copy
subdirectories, in favor of using copydir() on all platforms. This not
only should allow much better error reporting, but allows us to fsync
the created files before trusting that the copy has succeeded.
continue to recurse after eliminating a NOT-below-a-NOT, since the
contained subexpression will now be part of the top-level AND/OR structure
and so deserves to be simplified. The real-world impact of this is
probably minimal, since it'd require at least three levels of NOT to make
a difference, but it's still a bug.
Also remove some redundant tests for NULL subexpressions.
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly. Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries. Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
Also, write multiple WAL buffers out in one write() operation.
ITAGAKI Takahiro
---------------------------------------------------------------------------
> If we disable writeback-cache and use open_sync, the per-page writing
> behavior in WAL module will show up as bad result. O_DIRECT is similar
> to O_DSYNC (at least on linux), so that the benefit of it will disappear
> behind the slow disk revolution.
>
> In the current source, WAL is written as:
> for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
> Is this intentional? Can we rewrite it as follows?
> write(&buffers[0], N * BLCKSZ);
>
> In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff).
> Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff).
> These two patches are independent, so they can be applied either or both.
>
>
> I tested them on my machine and the results as follows. It shows that
> direct-io and gather-write is the best choice when writeback-cache is off.
> Are these two patches worth trying if they are used together?
>
>
> | writeback | fsync= | fdata | open_ | fsync_ | open_
> patch | cache | false | sync | sync | direct | direct
> ------------+-----------+--------+-------+-------+--------+---------
> direct io | off | 124.2 | 105.7 | 48.3 | 48.3 | 48.2
> direct io | on | 129.1 | 112.3 | 114.1 | 142.9 | 144.5
> gather-write| off | 124.3 | 108.7 | 105.4 | (N/A) | (N/A)
> both | off | 131.5 | 115.5 | 114.4 | 145.4 | 145.2
>
> - 20runs * pgbench -s 100 -c 50 -t 200
> - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
> - using 2 ATA disks:
> - hda(reiserfs) includes system and wal.
> - hdc(jfs) includes database files. writeback-cache is always on.
>
> ---
> ITAGAKI Takahiro
An attached patch is a small additional improvement.
This patch use appendStringInfoText instead of appendStringInfoString.
There is an overhead of PG_TEXT_GET_STR when appendStringInfoString is
executed by text type. This can be reduced by appendStringInfoText.
Atsushi Ogawa
planning logic for bitmap indexscans. Partial indexes create corner
cases in which a scan might be done with no explicit index qual conditions,
and the code wasn't handling those cases nicely. Also be a little
tenser about eliminating redundant clauses in the generated plan.
Per report from Dmitry Karasik.
to the "text" segment. It would be possible to mark the elements of the
array "const" as well, but this would require multiple API changes and
does not seem to be worth the notational inconvenience.
into .def and .exp files automatically on Windows, AIX, and the like.
An additional benefit is that changes in libpgport files correctly
propagate to force rebuild of the backend executable. This is my
reworking of Rocco Altier's idea, and if it breaks anything it's
definitely my fault.
doesn't automatically inherit the privileges of roles it is a member of;
for such a role, membership in another role can be exploited only by doing
explicit SET ROLE. The default inherit setting is TRUE, so by default
the behavior doesn't change, but creating a user with NOINHERIT gives closer
adherence to our current reading of SQL99. Documentation still lacking,
and I think the information schema needs another look.
existing ones for object privileges. Update the information_schema for
roles --- pg_has_role() makes this a whole lot easier, removing the need
for most of the explicit joins with pg_user. The views should be a tad
faster now, too. Stephen Frost and Tom Lane.
pg_strcasecmp and pg_strncasecmp ... but I see some of the former have
crept back in.
Eternal vigilance is the price of locale independence, apparently.
near daylight savings time boudaries. This handles it properly, e.g.
test=> select '2005-04-03 04:00:00'::timestamp at time zone
'America/Los_Angeles';
timezone
------------------------
2005-04-03 07:00:00-04
(1 row)
test=> select (CURRENT_DATE + '05:00'::time)::timestamp at time zone
'Canada/Pacific';
timezone
------------------------
2005-07-22 08:00:00-04
(1 row)
test=> select ('2005-07-20 00:00:00'::timestamp without time zone) at
time zone 'Europe/Paris';
timezone
------------------------
2005-07-19 22:00:00-04
Udpate documentation.
coding would ignore startup cost differences of less than 1% of the
estimated total cost; which was OK for normal planning but highly not OK
if a very small LIMIT was applied afterwards, so that startup cost becomes
the name of the game. Instead, compare startup and total costs fuzzily
but independently. This changes the plan selected for two queries in the
regression tests; adjust expected-output files for resulting changes in
row order. Per reports from Dawid Kuroczko and Sam Mason.
24 hours. This is very helpful for daylight savings time:
select '2005-05-03 00:00:00 EST'::timestamp with time zone + '24 hours';
?column?
----------------------
2005-05-04 01:00:00-04
select '2005-05-03 00:00:00 EST'::timestamp with time zone + '1 day';
?column?
----------------------
2005-05-04 01:00:00-04
Michael Glaesemann
test=> select '4 months'::interval / 5;
?column?
---------------
1 mon -6 days
(1 row)
after:
test=> select '4 months'::interval / 5;
?column?
----------
24 days
(1 row)
The problem was the use of rint() to round, and then find the remainder,
causing the negative values.
output targetlist of the Unique or HashAgg plan. This code was OK when
written, but subsequent changes to use "physical tlists" where possible
had broken it: given an input subplan that has extra variables added to
avoid a projection step, it would copy those extra variables into the
upper tlist, which is pointless since a projection has to happen anyway.
I have seen this case in CVS tip due to new "physical tlist" optimization
for subqueries. I believe it probably can't happen in existing releases,
but the check is not going to hurt anything, so backpatch to 8.0 just
in case.
cases: we can't just consider whether the subquery's output is unique on its
own terms, we have to check whether the set of output columns we are going to
use will be unique. Per complaint from Luca Pireddu and test case from
Michael Fuhr.
requiring superuserness always, allow an owner to reassign ownership
to any role he is a member of, if that role would have the right to
create a similar object. These three requirements essentially state
that the would-be alterer has enough privilege to DROP the existing
object and then re-CREATE it as the new role; so we might as well
let him do it in one step. The ALTER TABLESPACE case is a bit
squirrely, but the whole concept of non-superuser tablespace owners
is pretty dubious anyway. Stephen Frost, code review by Tom Lane.
optional arguments as text input functions, ie, typioparam OID and
atttypmod. Make all the datatypes that use typmod enforce it the same
way in typreceive as they do in typinput. This fixes a problem with
failure to enforce length restrictions during COPY FROM BINARY.
The specification of this function is as follows.
regexp_replace(source text, pattern text, replacement text, [flags
text])
returns text
Replace string that matches to regular expression in source text to
replacement text.
- pattern is regular expression pattern.
- replacement is replace string that can use '\1'-'\9', and '\&'.
'\1'-'\9': back reference to the n'th subexpression.
'\&' : entire matched string.
- flags can use the following values:
g: global (replace all)
i: ignore case
When the flags is not specified, case sensitive, replace the first
instance only.
Atsushi Ogawa
XLOG_DBASE_DROP_OLD WAL records -- these records are no longer created in
current sources. Adjust numbering of XLOG_DBASE_CREATE and XLOG_DBASE_DROP
and bump the catversion. Patch from Gavin Sherry, adjusted by Neil Conway.
have adequate mechanisms for tracking the contents of databases and
tablespaces). This solves the longstanding problem that you can drop a
user who still owns objects and/or has access permissions.
Alvaro Herrera, with some kibitzing from Tom Lane.
The content of the patch is as follows:
(1)Create shortcut when subtext was not found.
(2)Stop using LEFT and RIGHT macro.
In LEFT and RIGHT macro, TEXTPOS is executed by the same content as
execution immediately before. The execution frequency of TEXTPOS can be
reduced by using text_substring instead of LEFT and RIGHT macro.
(3)Add appendStringInfoText, and use it instead of
appendStringInfoString.
There is an overhead of PG_TEXT_GET_STR when appendStringInfoString is
executed by text type. This can be reduced by appendStringInfoText.
(4)Reduce execution of TEXTDUP.
The effect of the patch that I measured is as follows:
- The Data for test was created by 'pgbench -i'.
- Test SQL:
select replace(aid, '9', 'A') from accounts;
- Test results: Linux(CPU: Pentium III, Compiler option: -O2)
original: 1.515s
patched: 1.250s
Atsushi Ogawa
chdir into PGDATA and subsequently use relative paths instead of absolute
paths to access all files under PGDATA. This seems to give a small
performance improvement, and it should make the system more robust
against naive DBAs doing things like moving a database directory that
has a live postmaster in it. Per recent discussion.
able to do this before, but I had tried to make an exception for functions
with OUT parameters. Michael Fuhr found one problem with it already, and
I found another, which was it didn't work for strict functions with a
NULL input. While both of these could be worked around, the probability
that there are more gotchas seems high; I think prudence dictates just
reverting to the former behavior for now. Accordingly, remove the kluge
added to get_expr_result_type() for Michael's case.
propagated inside an outer join. In particular, given
LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that
B = constant at the top level (B might be null instead), but we
can nonetheless put a restriction B = constant into the quals for
B's relation, since no inner-side rows not meeting that condition
can contribute to the final result. Similarly, given
FULL JOIN USING (J) WHERE J = constant, we can't directly conclude
that either input J variable = constant, but it's OK to push such
quals into each input rel. Per recent gripe from Kim Bisgaard.
Along the way, remove 'valid_everywhere' flag from RestrictInfo,
as on closer analysis it was not being used for anything, and was
defined backwards anyway.
basic regression tests for GiST to the standard regression tests.
I took the opportunity to add an rtree-equivalent gist opclass for
circles; the contrib version only covered boxes and polygons, but
indexing circles is very handy for distance searches.
the difference between checkpoints forced due to WAL segment consumption
and checkpoints forced for other reasons (such as CREATE DATABASE). Avoid
generating 'checkpoints are occurring too frequently' messages when the
checkpoint wasn't caused by WAL segment consumption. Per gripe from
Chris K-L.
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields. This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
role memberships; make superuser/createrole distinction do something
useful; fix some locking and CommandCounterIncrement issues; prevent
creation of loops in the membership graph.
syntactic conflicts, both privilege and role GRANT/REVOKE commands have
to use the same production for scanning the list of tokens that might
eventually turn out to be privileges or role names. So, change the
existing GRANT/REVOKE code to expect a list of strings not pre-reduced
AclMode values. Fix a couple other minor issues while at it, such as
InitializeAcl function name conflicting with a Windows system function.
and pg_auth_members. There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance). But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies. The catalog changes should
be pretty much done.
- full concurrency for insert/update/select/vacuum:
- select and vacuum never locks more than one page simultaneously
- select (gettuple) hasn't any lock across it's calls
- insert never locks more than two page simultaneously:
- during search of leaf to insert it locks only one page
simultaneously
- while walk upward to the root it locked only parent (may be
non-direct parent) and child. One of them X-lock, another may
be S- or X-lock
- 'vacuum full' locks index
- improve gistgetmulti
- simplify XLOG records
Fix bug in index_beginscan_internal: LockRelation may clean
rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation
with a table that has a small predicted size. Avoids wasting several
hundred K on the timezone hash table, which is likely to have only one
or a few entries, but the entries use up 10Kb apiece ...
with main, avoid using a SQL-defined SQLSTATE for what is most definitely
not a SQL-compatible error condition, fix documentation omissions,
adhere to message style guidelines, don't use two GUC_REPORT variables
when one is sufficient. Nothing done about pg_dump issues.
literally.
Add GUC variables:
"escape_string_warning" - warn about backslashes in non-E strings
"escape_string_syntax" - supports E'' syntax?
"standard_compliant_strings" - treats backslashes literally in ''
Update code to use E'' when escapes are used.
should fix the recent reports of "index is not a btree" failures,
as well as preventing a more obscure race condition involving changes
to a template database just after copying it with CREATE DATABASE.
to the existing X-direction tests. An rtree class now includes 4 actual
2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests.
This involved adding four new Y-direction test operators for each of
box and polygon; I followed the PostGIS project's lead as to the names
of these operators.
NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright
(&>) operators now have semantics comparable to box_overleft and box_overright.
This is necessary to make r-tree indexes work correctly on polygons.
Also, I changed circle_left and circle_right to agree with box_left and
box_right --- formerly they allowed the boundaries to touch. This isn't
actually essential given the lack of any r-tree opclass for circles, but
it seems best to sync all the definitions while we are at it.
CURRENT_TIME, and LOCALTIME: now they just produce "timestamptz" not
"timestamptz(6)", etc. This makes the behavior more consistent with our
choice to not assign a specific default precision to column datatypes.
It should also save a few cycles at runtime due to not having to invoke
the round-to-given-precision functions.
I also took the opportunity to translate CURRENT_TIMESTAMP into "now()"
instead of an invocation of the timestamptz input converter --- this should
save a few cycles too.
polygon operators (<<, &<, >>, &>). Per ideas originally put forward
by andrew@supernews and later rediscovered by moi. This patch just
fixes the existing opclasses, and does not add any new behavior as I
proposed earlier; that can be sorted out later. In principle this
could be back-patched, since it changes only search behavior and not
system catalog entries nor rtree index contents. I'm not currently
planning to do that, though, since I think it could use more testing.
in the database. The old behavior (reindex system catalogs only) is now
available as REINDEX SYSTEM. I did not add the complementary REINDEX USER
case since there did not seem to be consensus for this, but it would be
trivial to add later. Per recent discussions.
argument list contains parameter symbols ($n) declared as type VOID,
discard these arguments. This allows the driver to avoid renumbering
mixed IN and OUT argument placeholders (the JDBC syntax involves writing
? for both IN and OUT parameters, but on the server side we don't think
that OUT parameters are arguments). This doesn't break any currently-
useful cases since VOID is not used as an input argument type.
is redundant after a check has already been made for "num < 0". The "set"
variable can also be removed, as it is now no longer used. Per checking
with Karel, this is the right fix.
Per Coverity static analysis performed by EnterpriseDB.
unlike template0 and template1 does not have any special status in
terms of backend functionality. However, all external utilities such
as createuser and createdb now connect to "postgres" instead of
template1, and the documentation is changed to encourage people to use
"postgres" instead of template1 as a play area. This should fix some
longstanding gotchas involving unexpected propagation of database
objects by createdb (when you used template1 without understanding
the implications), as well as ameliorating the problem that CREATE
DATABASE is unhappy if anyone else is connected to template1.
Patch by Dave Page, minor editing by Tom Lane. All per recent
pghackers discussions.
malformed ident map file. This was introduced by the linked list
rewrite in 8.0 -- mea maxima culpa.
Per Coverity static analysis performed by EnterpriseDB.
only used in one branch of an if statement, so we can move its
declaration to that block. This also avoids an unnecessary syscache
lookup.
Per Coverity static analysis performed by EnterpriseDB.
(due to the preceding strlen(), for example), so we needn't recheck this
before invoking pg_mbcliplen().
Per Coverity static analysis performed by EnterpriseDB.
(a/k/a SELECT INTO). Instead, flush and fsync the whole relation before
committing. We do still need the WAL log when PITR is active, however.
Simon Riggs and Tom Lane.
2. improve vacuum for gist
- use FSM
- full vacuum:
- reforms parent tuple if it's needed
( tuples was deleted on child page or parent tuple remains invalid
after crash recovery )
- truncate index file if possible
3. fixes bugs and mistakes
scankeys arrays that it needs can never have more than INDEX_MAX_KEYS
entries, so it's reasonable to just allocate them as fixed-size local
arrays, and save the cost of palloc/pfree. Not a huge savings, but
a cycle saved is a cycle earned ...
includes error checking and an appropriate ereport(ERROR) message.
This gets rid of rather tedious and error-prone manipulation of errno,
as well as a Windows-specific bug workaround, at more than a dozen
call sites. After an idea in a recent patch by Heikki Linnakangas.
given reasonably short lifespans for prepared transactions, this should
mean that only a small minority of state files ever need to be fsynced
at all. Per discussion with Heikki Linnakangas.
not memcpy() to copy the offered key into the hash table during HASH_ENTER.
This avoids possible core dump if the passed key is located very near the
end of memory. Per report from Stefan Kaltenbrunner.
old suggestion by Oliver Jowett. Also, add a transaction column to the
pg_locks view to show the xid of each transaction holding or awaiting
locks; this allows prepared transactions to be properly associated with
the locks they own. There was already a column named 'transaction',
and I chose to rename it to 'transactionid' --- since this column is
new in the current devel cycle there should be no backwards compatibility
issue to worry about.
work if either of the join relations are empty. The logic is:
(1) if the inner relation's startup cost is less than the outer
relation's startup cost and this is not an outer join, read
a single tuple from the inner relation via ExecHash()
- if NULL, we're done
(2) read a single tuple from the outer relation
- if NULL, we're done
(3) build the hash table on the inner relation
- if hash table is empty and this is not an outer join,
we're done
(4) otherwise, do hash join as usual
The implementation uses the new MultiExecProcNode API, per a
suggestion from Tom: invoking ExecHash() now produces the first
tuple from the Hash node's child node, whereas MultiExecHash()
builds the hash table.
I had to put in a bit of a kludge to get the row count returned
for EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to
return a tuple, and then MultiExecHash() is invoked, we would
return one too many tuples to EXPLAIN ANALYZE. I hacked around
this by just manually detecting this situation and subtracting 1
from the EXPLAIN ANALYZE row count.
"AT TIME ZONE", and not just the shorlist previously available. For
example:
SELECT CURRENT_TIMESTAMP AT TIME ZONE 'Europe/London';
works fine now. It will also obey whatever DST rules were in effect at
just that date, which the previous implementation did not.
It also supports the AT TIME ZONE on the timetz datatype. The whole
handling of DST is a bit bogus there, so I chose to make it use whatever
DST rules are in effect at the time of executig the query. not sure if
anybody is actuallyi *using* timetz though, it seems pretty
unpredictable just because of this...
Magnus Hagander
it is sufficient to track whether a backend holds a lock or not, and
store information about transaction vs. session locks only in the
inside-the-backend LocalLockTable. Since there can now be but one
PROCLOCK per lock per backend, LockCountMyLocks() is no longer needed,
thus eliminating some O(N^2) behavior when a backend holds many locks.
Also simplify the LockAcquire/LockRelease API by passing just a
'sessionLock' boolean instead of a transaction ID. The previous API
was designed with the idea that per-transaction lock holding would be
important for subtransactions, but now that we have subtransactions we
know that this is unwanted. While at it, add an 'isTempObject' parameter
to LockAcquire to indicate whether the lock is being taken on a temp
table. This is not used just yet, but will be needed shortly for
two-phase commit.
part of service principal. If not set, any service principal matching
an entry in the keytab can be used.
NEW KERBEROS MATCHING BEHAVIOR FOR 8.1.
Todd Kover
if geqo_rand() returns exactly 1.0, resulting in failure due to indexing
off the end of the pool array. Also, since this is using inexact float math,
it seems wise to guard against roundoff error producing values slightly
outside the expected range. Per report from bug@zedware.org.
constraint while determining whether the index sort order matches the
query's ORDER BY. This for example allows an index on (x,y) to match
... WHERE x = 42 ORDER BY y;
It only works for btree indexes, but since those are the only ones we
currently have that are ordered at all, that's good enough for now.
Per popular demand.
nonconsecutive columns of a multicolumn index, as per discussion around
mid-May (pghackers thread "Best way to scan on-disk bitmaps"). This
turns out to require only minimal changes in btree, and so far as I can
see none at all in GiST. btcostestimate did need some work, but its
original assumption that index selectivity == heap selectivity was
quite bogus even before this.
a descriptor that uses the current transaction snapshot, rather than
SnapshotNow as it did before (and still does if INV_WRITE is set).
This means pg_dump will now dump a consistent snapshot of large object
contents, as it never could do before. Also, add a lo_create() function
that is similar to lo_creat() but allows the desired OID of the large
object to be specified. This will simplify pg_restore considerably
(but I'll fix that in a separate commit).
These contain the SQLSTATE and error message of the current exception,
respectively. They are scope-local variables that are only defined
in exception handlers (so attempting to reference them outside an
exception handler is an error). Update the regression tests and the
documentation.
Also, do some minor related cleanup: export an unpack_sql_state()
function from the backend and use it to unpack a SQLSTATE into a
string, and add a free_var() function to pl_exec.c
Original patch from Pavel Stehule, review by Neil Conway.
to a subquery if the outer query is simple enough that the LIMIT can
be reflected directly to the subquery. This didn't use to be very
interesting, because a subquery that couldn't have been flattened into
the upper query was usually not going to be very responsive to
tuple_fraction anyway. But with new code that allows UNION ALL subqueries
to pay attention to tuple_fraction, this is useful to do. In particular
this lets the optimization occur when the UNION ALL is directly inside
a view.
if the limit were directly applied to it. This does not actually
add a LIMIT plan node to the generated subqueries --- that would be
useless overhead --- but it does cause the planner to prefer fast-
start plans when the limit is small. After an idea from Phil Endecott.
this in turn causes CREATE TABLE AS in plpgsql to set ROW_COUNT.
This is how it behaved before 7.4; I had unintentionally changed the
behavior in a bit of sloppy micro-optimization.
of a relation in a flat 'joininfo' list. The former arrangement grouped
the join clauses according to the set of unjoined relids used in each;
however, profiling on test cases involving lots of joins proves that
that data structure is a net loss. It takes more time to group the
join clauses together than is saved by avoiding duplicate tests later.
It doesn't help any that there are usually not more than one or two
clauses per group ...
as well as the existing pg_catalog entries for prefix and postfix %.
These have never been documented, though they did appear in one old
regression test. This avoids surprising behavior in cases like
"SELECT -25 % -10". Per recent discussion.
Note: although there is a catalog change here, I did not force initdb
since there's no harm in leaving the inaccessible entries in one's
copy of pg_operator.
transaction IDs, rather than like subtrans; in particular, the information
now survives a database restart. Per previous discussion, this is
essential for PITR log shipping and for 2PC.
last nextval() or setval() performed by the current session. Update the
docs, add regression tests, and bump the catalog version. Patch from
Dennis Björklund, various improvements by Neil Conway.
---------------------------------------------------------------------------
While playing around, I got the following error message:
--
FATAL: pre-existing shared memory block (key 5432001, ID 90898435) is
still in use
HINT: If you're sure there are no old server processes still running,
remove the shared memory block with the command "ipcrm", or just delete
the file "/home/hlinnaka/pgsql/data/postmaster.pid".
---
Thats normal because I used "kill -9 postmaster" to shut down.
The hint advises me to use "ipcrm", but there's the "ipcclean" script in
bin for just this purpose. The hint should probably advise to use
ipcclean.
The attached patch replaces all occurances of "ipcrm" with "ipcclean" in
src/backend/utils/init/miscinit.c and all the translations in
src/backend/po.
While reviewing the patch, I noticed a likely typo in hr.po. While I
don't
speak Croatian, the translation seems to advise to use the "icpm(1)"
command. I changed that to "ipcclean" too.
Heikki Linnakangas
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero. That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space. Per suggestion by Heikki Linnakangas.
other_rel_list with a single array indexed by rangetable index.
This reduces find_base_rel from O(N) to O(1) without any real penalty.
While find_base_rel isn't one of the major bottlenecks in any profile
I've seen so far, it was starting to creep up on the radar screen
for complex queries --- so might as well fix it.
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code. This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
representation as the jointree) with two lists of RTEs, one showing
the RTEs accessible by qualified names, and the other showing the RTEs
accessible by unqualified names. I think this is conceptually simpler
than what we did before, and it's sure a whole lot easier to search.
This seems to eliminate the parse-time bottleneck for deeply nested
JOIN structures that was exhibited by phil@vodafone.
---------------------------------------------------------------------------
Tom Lane <tgl@sss.pgh.pa.us> writes:
> a_ogawa <a_ogawa@hi-ho.ne.jp> writes:
> > It is a reasonable idea. However, the majority part of MemSet was not
> > able to be avoided by this idea. Because the per-tuple contexts are used
> > at the early stage of executor.
>
> Drat. Well, what about changing that? We could introduce additional
> contexts or change the startup behavior so that the ones that are
> frequently reset don't have any data in them unless you are working
> with pass-by-ref values inside the inner loop.
That might be possible. However, I think that we should change only
aset.c about this article.
I thought further: We can check whether context was used from the last
reset even when blocks list is not empty. Please see attached patch.
postgresql.conf.
---------------------------------------------------------------------------
Here's an updated version of the patch, with the following changes:
1) No longer uses "service name" as "application version". It's instead
hardcoded as "postgres". It could be argued that this part should be
backpatched to 8.0, but it doesn't make a big difference until you can
start changing it with GUC / connection parameters. This change only
affects kerberos 5, not 4.
2) Now downcases kerberos usernames when the client is running on win32.
3) Adds guc option for "krb_caseins_users" to make the server ignore
case mismatch which is required by some KDCs such as Active Directory.
Off by default, per discussion with Tom. This change only affects
kerberos 5, not 4.
4) Updated so it doesn't conflict with the rendevouz/bonjour patch
already in ;-)
Magnus Hagander
> a_ogawa <a_ogawa@hi-ho.ne.jp> writes:
> > It is a reasonable idea. However, the majority part of MemSet was not
> > able to be avoided by this idea. Because the per-tuple contexts are used
> > at the early stage of executor.
>
> Drat. Well, what about changing that? We could introduce additional
> contexts or change the startup behavior so that the ones that are
> frequently reset don't have any data in them unless you are working
> with pass-by-ref values inside the inner loop.
That might be possible. However, I think that we should change only
aset.c about this article.
I thought further: We can check whether context was used from the last
reset even when blocks list is not empty. Please see attached patch.
The effect of the patch that I measured is as follows:
o Execution time that executed the SQL ten times.
(1)Linux(CPU: Pentium III, Compiler option: -O2)
- original: 24.960s
- patched : 23.114s
(2)Linux(CPU: Pentium 4, Compiler option: -O2)
- original: 8.730s
- patched : 7.962s
(3)Solaris(CPU: Ultra SPARC III, Compiler option: -O2)
- original: 37.0s
- patched : 33.7s
Atsushi Ogawa (a_ogawa)
RTE of interest, rather than the whole rangetable list. This makes
the API more understandable and avoids duplicate RTE lookups. This
patch reverts no-longer-needed portions of my patch of 2004-08-19.
performance problem pointed out by phil@vodafone: to wit, we were
spending O(N^2) time to check dropped-ness in an N-deep join tree,
even in the case where the tree was freshly constructed and couldn't
possibly mention any dropped columns. Instead of recursing in
get_rte_attribute_is_dropped(), change the data structure definition:
the joinaliasvars list of a JOIN RTE must have a NULL Const instead
of a Var at any position that references a now-dropped column. This
costs nothing during normal parse-rewrite-plan path, and instead we
have a linear-time update to make when loading a stored rule that
might contain now-dropped columns. While at it, move the responsibility
for acquring locks on relations referenced by rules into this separate
function (which I therefore chose to call AcquireRewriteLocks).
This saves effort --- namely, duplicated lock grabs in parser and rewriter
--- in the normal path at a cost of one extra non-locked heap_open()
in the stored-rule path; seems a good tradeoff. A fringe benefit is
that it is now *much* clearer that we acquire lock on relations referenced
in rules before we make any rewriter decisions based on their properties.
(I don't know of any bug of that ilk, but it wasn't exactly clear before.)
to just around the bare recv() call that gets a command from the client.
The former placement in PostgresMain was unsafe because the intermediate
processing layers (especially SSL) use facilities such as malloc that are
not necessarily re-entrant. Per report from counterstorm.com.
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block. Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages. Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective. All per recent discussions.
and RelationNameGetTupleDesc() as deprecated; remove uses of the
latter in the contrib library. Along the way, clean up crosstab()
code and documentation a little.
key, compare the new and old row versions. If the foreign key column has
not changed, we needn't enqueue the trigger, since the update cannot
violate the foreign key. This optimization was previously applied in the
RI trigger function, but it is more efficient to avoid firing the trigger
altogether. Per recent discussion on pgsql-hackers.
Also add a regression test for some unintuitive foreign key behavior, and
refactor some code that deals with the OIDs of the various RI trigger
functions.
keys, rather than a single trigger for both events. This should not change
functionality, but it is more consistent: previously, there were trigger
functions for both "check_insert" and "check_update", but the former was
used for both events.
Bump catalog version number (not strictly necessary, but best to be
cautious).
would be evaluated only once anyway (ie, it's just a SELECT with no
FROM or an INSERT ... VALUES). The planner can't do it any faster than
the executor, so no point in an extra copying of the expression tree.
pg_class_aclmask(). We only need to do this when we have to check
pg_shadow.usecatupd, and that's not relevant unless the target table
is a system catalog. So we can usually avoid one syscache lookup.
are now reported via elog, eliminating the need to test the result code
at most call sites. Make it possible for the caller to distinguish a
freshly acquired lock from one already held in the current transaction.
Use that capability to avoid redundant AcceptInvalidationMessages() calls
in LockRelation().
status of the most recently queried userid. Since the common pattern is
many successive queries about the same user (ie, the current user) this
can save a lot of syscache probes.