plpgsql support to come later. Along the way, convert execMain's
SELECT INTO support into a DestReceiver, in order to eliminate some ugly
special cases.
Jonah Harris and Tom Lane
The main reason for this refactoring was that set_config_option() had become
an overly overloaded function and its behavior was not consistent. The old
version of set_config_option() hid some messages. For example, if you put
tcp_port = 5432.1
in postgresql.conf, the old implementation ignored the error without writing
any message to the log when the file was reloaded via signal (configuration
reload). The main problem was that semantic analysis of postgresql.conf was
performed not in ProcessConfigFile but in set_config_option, *after* the
context check; this skipped the check for variables with PG_POSTMASTER
context. Joachim Wieland had also requested more messages about changes in
the config file that are ignored.
Zdenek Kotala
loaded libraries: call functions _PG_init() and _PG_fini() if the library
defines such symbols. Hence we no longer need to specify an initialization
function in preload_libraries: we can assume that the library used the
_PG_init() convention, instead. This removes one source of pilot error
in use of preloaded libraries. Original patch by Ralf Engelschall,
preload_libraries changes by me.
o print user name for all
o print portal name if defined for all
o print query for all
o reduce log_statement header to single keyword
o print bind parameters as DETAIL if text mode
that's shorter-lived than the expression state being evaluated in it really
doesn't work :-( --- we end up with fn_extra caches getting deleted while
still in use. Rather than abandon the notion of caching expression state
across domain_in calls altogether, I chose to make domain_in a bit cozier
with ExprContext. All we really need for evaluating variable-free
expressions is an ExprContext, not an EState, so I invented the notion of a
"standalone" ExprContext. domain_in can prevent resource leakages by doing
a ReScanExprContext on this rather than having to free it entirely; so we
can make the ExprContext have the same lifespan (and particularly the same
per_query memory context) as the expression state structs.
(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed
by the spec. (e.g. similar to a FROM clause subselect). initdb required.
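A minimal sketch of the new syntax (table and names are hypothetical, not from the patch):
    CREATE TABLE points (x int, y int);
    INSERT INTO points VALUES (1, 2), (3, 4), (5, 6);
    SELECT * FROM (VALUES (1, 'one'), (2, 'two')) AS v(num, name);  -- VALUES as a FROM-clause subselect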
Joe Conway and Tom Lane.
(table or index) before trying to open its relcache entry. This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load. Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.
Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
it's handled just about like timezone; in particular, don't try
to read anything during InitializeGUCOptions. Should solve current
startup failure on Windows, and avoid wasted cycles if a nondefault
setting is specified in postgresql.conf too. Possibly we need to
think about a more general solution for handling 'expensive to set'
GUC options.
the float8 versions of the aggregates, which is all that the standard requires.
Sergey's original patch also provided versions using numeric arithmetic,
but given the size and slowness of the code, I doubt we ought to include
those in core.
the opportunity to treat COUNT(*) as a zero-argument aggregate instead
of the old hack that equated it to COUNT(1); this is materially cleaner
(no more weird ANYOID cases) and ought to be at least a tiny bit faster.
Original patch by Sergey Koposov; review, documentation, simple regression
tests, pg_dump and psql support by moi.
not "unset". An "unset" state doesn't really exist; all variables behave
like an empty string value if the string being pointed to has not been
initialized.
configuration files that can be altered by a DBA. The australian_timezones
GUC setting disappears, replaced by a timezone_abbreviations setting (set this
to 'Australia' to get the effect of australian_timezones). The list of zone
names defined by default has undergone a bit of cleanup, too. Documentation
still needs some work --- in particular, should we fix Table B-4, or just get
rid of it? Joachim Wieland, with some editorializing by moi.
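For illustration, the replacement setting can be exercised like this (a sketch; the abbreviation list name is assumed):
    SET timezone_abbreviations = 'Australia';    -- replaces australian_timezones = on
    SELECT '2006-01-15 09:30 EST'::timestamptz;  -- EST now resolves per the Australian abbreviation list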
thinking that indexes of different sizes are equally attractive. Per
gripe from Jim Nasby. (I remain unconvinced that there's such a problem
in existing releases, but CVS HEAD definitely has got a problem because
of its new count-only-leaf-pages approach to indexscan costing.)
to the low-order bits of the entry hash value. Also make some incidental
cleanups in the dynahash API, such as not exporting the hash header
structs to the world.
opclass. This is not so much because anyone's likely to create an index
on TID, as that sorting TIDs can be useful. Also added max and min
aggregates while at it, so that one can investigate the clusteredness of
a table with queries like SELECT min(ctid), max(ctid) FROM tab WHERE ...
Greg Stark and Tom Lane
To this end, add a couple of columns to pg_class, relminxid and relvacuumxid,
based on which we calculate the pg_database columns after each vacuum.
We now force all databases to be vacuumed, even template ones. A backend
noticing too old a database (meaning pg_database.datminxid is in danger of
falling behind Xid wraparound) will signal the postmaster, which in turn will
start an autovacuum iteration to process the offending database. In principle
this is only there to cope with frozen (non-connectable) databases without
forcing users to set them to connectable, but it could force a regular user
database to go through a database-wide vacuum at any time. Maybe we should
warn users about this somehow. Of course the real solution will be to use
autovacuum all the time ;-)
There are some additional improvements we could have in this area: for example
the vacuum code could be smarter about not updating pg_database for each table
when called by autovacuum, and do it only once the whole autovacuum iteration
is done.
I updated the system catalogs documentation, but I didn't modify the
maintenance section. Also having some regression tests for this would be nice
but it's not really a very straightforward thing to do.
Catalog version bumped due to system catalog changes.
discussion (including making def_arg allow reserved words), add missed
opt_definition for UNIQUE case. Put the reloptions support code in a less
random place (I chose to make a new file access/common/reloptions.c).
Eliminate header inclusion creep. Make the index options functions safely
user-callable (seems like client apps might like to be able to test validity
of options before trying to make an index). Reduce overhead for normal case
with no options by allowing rd_options to be NULL. Fix some unmaintainably
klugy code, including getting rid of Natts_pg_class_fixed at long last.
Some stylistic cleanup too, and pay attention to keeping comments in sync
with code.
Documentation still needs work, though I did fix the omissions in
catalogs.sgml and indexam.sgml.
ScalarArrayOpExpr index quals: we were estimating the right total
number of rows returned, but treating the index-access part of the
cost as if a single scan were fetching that many consecutive index
tuples. Actually we should treat it as a multiple indexscan, and
if there are enough of 'em the Mackert-Lohman discount should kick in.
per-tuple space overhead for sorts in memory. I chose to replace the
previous patch that tried to write out the bare minimum amount of data
when sorting on disk; instead, just dump the MinimalTuples as-is. This
wastes 3 to 10 bytes per tuple depending on architecture and null-bitmap
length, but the simplification in the writetup/readtup routines seems
worth it.
tuples with less header overhead than a regular HeapTuple, per my
recent proposal. Teach TupleTableSlot code how to deal with these.
As proof of concept, change tuplestore.c to store MinimalTuples instead
of HeapTuples. Future patches will expand the concept to other places
where it is useful.
palloc() will normally round allocation requests up to the next power of 2,
so make dynahash choose allocation sizes that are as close to a power of 2
as possible.
Back-patch to 8.1 --- the problem exists further back, but a much larger
patch would be needed and it doesn't seem worth taking any risks.
changing semantics too much. statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead. I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>. There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
libpq/md5.h, so that there's a clear separation between backend-only
definitions and shared frontend/backend definitions. (Turns out this
is reversing a bad decision from some years ago...) Fix up references
to crypt.h as needed. I looked into moving the code into src/port, but
the headers in src/include/libpq are sufficiently intertwined that it
seems more work than it's worth to do that.
current commands; instead, store current-status information in shared
memory. This substantially reduces the overhead of stats_command_string
and also ensures that pg_stat_activity is fully up to date at all times.
Per my recent proposal.
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
remove the infrastructure needed to enforce the limit, ie, the global
LRU list of cache entries. On small-to-middling databases this wins
because maintaining the LRU list is a waste of time. On large databases
this wins because it's better to keep more cache entries (we assume
such users can afford to use some more per-backend memory than was
contemplated in the Berkeley-era catcache design). This provides a
noticeable improvement in the speed of psql \d on a 10000-table
database, though it doesn't make it instantaneous.
While at it, use per-catcache settings for the number of hash buckets
per catcache, rather than the former one-size-fits-all value. It's a
bit silly to be using the same number of hash buckets for, eg, pg_am
and pg_attribute. The specific values I used might need some tuning,
but they seem to be in the right ballpark based on CATCACHE_STATS
results from the standard regression tests.
o remove many WIN32_CLIENT_ONLY defines
o add WIN32_ONLY_COMPILER define
o add 3rd argument to open() for portability
o add include/port/win32_msvc directory for
system includes
Magnus Hagander
that the Mackert-Lohman formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway. This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop. Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages. This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before. The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all. Per my recent proposal.
Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
This shouldn't affect simple indexscans much, while for bitmap scans that
are touching a lot of index rows, this seems to bring the estimates more
in line with reality. Per recent discussion.
assumed that a sequential page fetch has cost 1.0. This patch doesn't
in itself change the system's behavior at all, but it opens the door to
people adopting other units of measurement for EXPLAIN costs. Also, if
we ever decide it's worth inventing per-tablespace access cost settings,
this change provides a workable intellectual framework for that.
for LC_MESSAGES; instead, just press forward, leaving the effective setting
at 'C'. There is not any very good reason to complain when we are going
to replace the value soon with whatever postgresql.conf says. This change
should solve the occasionally-reported problem of initdb failing with
'failed to initialize lc_messages'; the current theory is that that is
a reflection of either wrong LANG/LC_MESSAGES or completely broken locale
support.
as this seems only likely to create headaches for module developers. Put
the macro in the pre-existing fmgr.h file instead. Avoid being too cute
about how many fields we can cram into a word, and avoid trying to fetch
from a library we've already unlinked.
Along the way, it occurred to me that the magic block really ought to be
'const' so it can be stored in the program text area. Do the same for
the existing data blocks for PG_FUNCTION_INFO_V1 functions.
It now only checks four things:
Major version number (7.4 or 8.1 for example)
NAMEDATALEN
FUNC_MAX_ARGS
INDEX_MAX_KEYS
The three constants were chosen because:
1. We document them in the config page in the docs
2. We mark them as changeable in pg_config_manual.h
3. Changing any of these will break some of the more popular modules:
   FUNC_MAX_ARGS changes the fmgr interface; every module uses this.
   NAMEDATALEN changes the syscache interface; every PL as well as tsearch uses this.
   INDEX_MAX_KEYS breaks tsearch and anything using GiST.
Martijn van Oosterhout
and standard_conforming_strings; likewise for the other client programs
that need it. As per previous discussion, a pg_dump dump now conforms
to the standard_conforming_strings setting of the source database.
We don't use E'' syntax in the dump, thereby improving portability of
the SQL. I added a SET escape_string_warning = off command to keep
the dumps from getting a lot of back-chatter from that.
'off'. This allows pg_dump output with standard_conforming_strings =
'on' to generate proper strings that can be loaded into other databases
without the backslash doubling we typically do. I have added the
dumping of the standard_conforming_strings value to pg_dump.
I also added standard backslash handling for plpgsql.
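A hedged sketch of the visible difference, assuming the parameter can be set per session:
    SET standard_conforming_strings = on;
    SELECT 'a\b';     -- backslash is an ordinary character; the value is a\b
    SET standard_conforming_strings = off;
    SELECT E'a\\b';   -- escape-string syntax is needed to get a literal backslash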
and transaction visibility fields of tuples being sorted. These are
always uninteresting in a tuple being sorted (if the fields were actually
selected, they'd have been pulled out into user columns beforehand).
This saves about 24 bytes per row being sorted, which is a useful savings
for any but the widest of sort rows. Per recent discussion.
parser will allow "\'" to be used to represent a literal quote mark. The
"\'" representation has been deprecated for some time in favor of the
SQL-standard representation "''" (two single quote marks), but it has been
used often enough that just disallowing it immediately won't do. Hence
backslash_quote allows the settings "on", "off", and "safe_encoding",
the last meaning to allow "\'" only if client_encoding is a valid server
encoding. That is now the default, and the reason is that in encodings
such as SJIS that allow 0x5c (ASCII backslash) to be the last byte of a
multibyte character, accepting "\'" allows SQL-injection attacks as per
CVE-2006-2314 (further details will be published after release). The
"on" setting is available for backward compatibility, but it must not be
used with clients that are exposed to untrusted input.
Thanks to Akio Ishida and Yasuo Ohgaki for identifying this security issue.
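For example (a sketch of the settings described above):
    SET backslash_quote = 'safe_encoding';  -- the new default: \' only for encodings where it cannot be abused
    SET backslash_quote = 'off';            -- always reject \'
    SELECT 'It''s fine';                    -- SQL-standard quoting works under any setting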
characters in all cases. Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release). Embedded
zero (null) bytes will be rejected as well. The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated. Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.
Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.
Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.
issued by autovacuum. Add accessor functions to them, and use those in the
pg_stat_*_tables system views.
Catalog version bumped due to changes in the pgstat views and the pgstat file.
Patch from Larry Rosenman, minor improvements by me.
throw warnings for 100%-SQL-standard constructs, clean up some minor
infelicities, try to un-break ecpg to the best of my ability. (It's not clear
how ecpg is going to find out the setting of standard_conforming_strings,
though.) I think pg_dump still needs work, too.
it's not necessary to have three separate calls anymore. This patch also
fixes things so we don't try to read pg_internal.init until after we've
obtained lock on the target database; which was fairly harmless, but it's
certainly cleaner this way.
The former approach used ExclusiveLock on pg_database, which being a
cluster-wide lock meant only one of these operations could proceed at
a time; worse, it also blocked all incoming connections in ReverifyMyDatabase.
Now that we have LockSharedObject(), we can use locks of different types
applied to databases considered as objects. This allows much more
flexible management of the interlocking: two CREATE DATABASEs need not
block each other, and need not block connections except to the template
database being used. Similarly DROP DATABASE doesn't block unrelated
operations. The locking used in flatfiles.c is also much narrower in
scope than before. Per recent proposal.
in various places that were previously doing ad hoc pg_database searches.
This may speed up database-related privilege checks a little bit, but
the main motivation is to eliminate the performance reason for having
ReverifyMyDatabase do such a lot of stuff (viz, avoiding repeat scans
of pg_database during backend startup). The locking reason for having
that routine is about to go away, and it'd be good to have the option
to break it up.
the union of its child relations as well. This might have been a good idea
when it was originally coded, but it's a fatally bad idea when inheritance is
being used for partitioning. It's better to have no stats at all than
completely misleading stats. Per report from Mark Liberman.
The bug arguably exists all the way back, but I've only patched HEAD and 8.1
because we weren't particularly trying to support partitioning before 8.1.
Eventually we ought to look at deriving union statistics instead of just
punting, but for now the drop kick looks good.
input datatypes given, and use this before trying OpernameGetCandidates.
This is faster than the old method when there's an exact match, and it
does not seem materially slower when there's not. And it definitely
makes some of the callers cleaner, because they didn't really want to
know about a list of candidates anyway. Per discussion with Atsushi Ogawa.
support both FOR UPDATE and FOR SHARE in one command, as well as both
NOWAIT and normal WAIT behavior. The more general code is actually
simpler and cleaner.
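A sketch of the now-permitted combination (table and column names are hypothetical):
    SELECT *
      FROM accounts a JOIN branches b ON a.branch_id = b.id
     WHERE b.id = 42
       FOR UPDATE OF a
       FOR SHARE OF b NOWAIT;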
cases. This was not needed in the existing uses within selfuncs.c, but if
we're gonna export it for general use, the extra generality seems helpful.
Motivated by looking at ltree example.
thereby saving a visit to the metapage in most index searches/updates.
This wouldn't actually save any I/O (since in the old regime the metapage
generally stayed in cache anyway), but it does provide a useful decrease
in bufmgr traffic in high-contention scenarios. Per my recent proposal.
transaction_timestamp() (just like now()).
Also update the statement_timeout documentation to mention it is statement arrival time
that is measured.
Catalog version updated.
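A quick illustration of the distinction (see also the statement_timestamp change above):
    SELECT now(), transaction_timestamp(), statement_timestamp();
    -- now() and transaction_timestamp() are identical (transaction start time);
    -- statement_timestamp() is the arrival time of the current client command.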
not named ones, and replace linear searches of the list with array indexing.
The named-parameter support has been dead code for many years anyway,
and recent profiling suggests that the searching was costing a noticeable
amount of performance for complex queries.
of rejecting palloc(0). Also, tweak like_selectivity() to avoid assuming
the presented pattern is nonempty; although that assumption is valid,
it doesn't really help much, and the new coding is more correct anyway
since it properly handles redundant wildcards. In combination these
changes should eliminate a Coverity warning noted by Martijn.
alternatives ("|" symbol). The original coding allowed the added ^ and $
constraints to be absorbed into the first and last alternatives, producing
a pattern that would match more than it should. Per report from Eric Noriega.
I also changed the pattern to add an ARE director ("***:"), ensuring that
SIMILAR TO patterns do not change behavior if regex_flavor is changed. This
is necessary to make the non-capturing parentheses work, and seems like a
good idea on general principles.
Back-patched as far as 7.4. 7.3 also has the bug, but a fix seems impractical
because that version's regex engine doesn't have non-capturing parens.
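To illustrate the intended behavior: SIMILAR TO must match the entire string even when alternatives are used, for example
    SELECT 'bcd'  SIMILAR TO 'a|bcd';   -- true: the whole string matches the second branch
    SELECT 'abcd' SIMILAR TO 'a|bcd';   -- false: neither branch matches the entire string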
when trying to locate the referent of a RECORD variable. This fixes the
'record type has not been registered' failure reported by Stefan
Kaltenbrunner about a month ago. A side effect of the way I chose to
fix it is that most variable references in join conditions will now be
properly labeled with the variable's source table name, instead of the
not-too-helpful 'outer' or 'inner' we used to use.
that apply the necessary domain constraint checks immediately. This fixes
cases where domain constraints went unchecked for statement parameters,
PL function local variables and results, etc. We can also eliminate existing
special cases for domains in places that had gotten it right, eg COPY.
Also, allow domains over domains (base of a domain is another domain type).
This almost worked before, but was disallowed because the original patch
hadn't gotten it quite right.
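For example, domains over domains now work as expected (hypothetical names):
    CREATE DOMAIN positive_int AS integer CHECK (VALUE > 0);
    CREATE DOMAIN percent AS positive_int CHECK (VALUE <= 100);
    SELECT 150::percent;   -- rejected: violates the percent CHECK constraint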
functions are not strict, they will be called (passing a NULL first parameter)
during any attempt to input a NULL value of their datatype. Currently, all
our input functions are strict and so this commit does not change any
behavior. However, this will make it possible to build domain input functions
that centralize checking of domain constraints, thereby closing numerous holes
in our domain support, as per previous discussion.
While at it, I took the opportunity to introduce convenience functions
InputFunctionCall, OutputFunctionCall, etc to use in code that calls I/O
functions. This eliminates a lot of grotty-looking casts, but the main
motivation is to make it easier to grep for these places if we ever need
to touch them again.
This commit doesn't make much functional change, but it does eliminate some
duplicated code --- for instance, PageIsNew tests are now done inside
XLogReadBuffer rather than by each caller.
The GIST xlog code still needs a lot of love, but I'll worry about that
separately.
The original coding stored the raw parser output (ColumnDef and TypeName
nodes) which was ugly, bulky, and wrong because it failed to create any
dependency on the referenced datatype --- and in fact would not track type
renamings and suchlike. Instead store a list of column type OIDs in the
RTE.
Also fix up general failure of recordDependencyOnExpr to do anything sane
about recording dependencies on datatypes. While there are many cases where
there will be an indirect dependency (eg if an operator returns a datatype,
the dependency on the operator is enough), we do have to record the datatype
as a separate dependency in examples like CoerceToDomain.
initdb forced because of change of stored rules.
during parse analysis, not only errors detected in the flex/bison stages.
This is per my earlier proposal. This commit includes all the basic
infrastructure, but locations are only tracked and reported for errors
involving column references, function calls, and operators. More could
be done later but this seems like a good set to start with. I've also
moved the ReportSyntaxErrorPosition logic out of psql and into libpq,
which should make it available to more people --- even within psql this
is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
similar constants if they were not previously defined. All these
constants must be defined by limits.h according to C89, so we can
safely assume they are present.
case where we run low on array slots before we run low on memory is much
more probable than I had thought, and so it's important to treat each
tape fairly in that case. To fix this, track per-tape slot allocations
just like we track per-tape space allocation. Also, in the FINALMERGE
code path avoid scanning all the input tapes when we really only need to
read from one. This should fix poor behavior with very large work_mem
as exhibited by Stefan Kaltenbrunner.
I didn't do anything about putting an upper bound on the number of tapes,
but maybe we should still consider that.
var_samp(), stddev_pop(), and stddev_samp(). var_samp() and stddev_samp()
are just renamings of the historical Postgres aggregates variance() and
stddev() -- the latter names have been kept for backward compatibility.
This patch includes updates for the documentation and regression tests.
The catversion has been bumped.
NB: SQL2003 requires that DISTINCT not be specified for any of these
aggregates. Per discussion on -patches, I have NOT implemented this
restriction: if the user asks for stddev(DISTINCT x), presumably they
know what they are doing.
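Usage sketch (hypothetical table):
    SELECT var_pop(x), var_samp(x), stddev_pop(x), stddev_samp(x),
           variance(x),   -- historical name, same as var_samp(x)
           stddev(x)      -- historical name, same as stddev_samp(x)
      FROM measurements;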
performance issue during regular merge passes not only the 'final merge'
case. The original design contemplated that there would never be more
than about one free block per 'tape', hence no need for an efficient
method of keeping the free blocks sorted. But given the later addition
of merge preread behavior in tuplesort.c, there is likely to be about
work_mem worth of free blocks, which is not so small ... and for that
matter the number of tapes isn't necessarily small anymore either. So
we'd better get rid of the assumption entirely. Instead, I'm assuming
that the usage pattern will involve alternation between merge preread
and writing of a new run. This makes it reasonable to just add blocks
to the list without sorting during successive ltsReleaseBlock calls,
and then do a qsort() when we start getting ltsGetFreeBlock() calls.
Experimentation seems to confirm that there aren't many qsort calls
relative to the number of ltsReleaseBlock/ltsGetFreeBlock calls.
we are doing the final merge pass on-the-fly, and not writing the data
back onto a 'tape', the number of free blocks in the tape set will become
large, leading to a lot of time wasted in ltsReleaseBlock(). There is
really no need to track the free blocks anymore in this state, so add a
simple shutoff switch. Per report from Stefan Kaltenbrunner.
(respectively) to rename yylex and related symbols. Some were doing
it this way already, while others used not-too-reliable sed hacks in
the Makefiles. It's all nice and consistent now.
not likely ever to be implemented seeing it's been removed from SQL2003.
This allows getting rid of the 'filter' version of yylex() that we had in
parser.c, which should save at least a few microseconds in parsing.
- new function justify_interval(interval)
- modified function justify_hours(interval)
- modified function justify_days(interval)
These functions are defined to meet the requirements as discussed in
this thread. Specifically:
- justify_hours makes certain the sign bit on the hours
matches the sign bit on the days. It only checks the
sign bit on the days, and not the months, when
determining if the hours should be positive or negative.
After the call, -24 < hours < 24.
- justify_days makes certain the sign bit on the days
matches the sign bit on the months. Its behavior does
not depend on the hours, nor does it modify the hours.
After the call, -30 < days < 30.
- justify_interval makes sure the sign bits on all three
fields months, days, and hours are all the same. After
the call, -24 < hours < 24 AND -30 < days < 30.
Mark Dilger
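Examples of the behavior described above (expected results shown as comments):
    SELECT justify_hours(interval '27 hours');          -- 1 day 03:00:00
    SELECT justify_days(interval '35 days');            -- 1 mon 5 days
    SELECT justify_interval(interval '1 mon -1 hour');  -- 29 days 23:00:00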
In particular, ensure that enlargement of the memtuples[] array doesn't
fall foul of MaxAllocSize when work_mem is very large, and don't bother
enlarging it if that would force an immediate switch into 'tape' mode anyway.
we'll go over to disk-based sort if we reach that limit.
This fixes Stefan Kaltenbrunner's observation that sorting can suffer an
'invalid memory alloc request size' failure when sort_mem is set large
enough. It's unfortunately not so easy to fix in 8.1 ...
are unnecessarily allocated on the heap rather than the stack. If the
StringInfo doesn't outlive the stack frame in which it is created,
there is no need to allocate it on the heap via makeStringInfo() --
stack allocation is faster. While it's not a big deal unless the
code is in a critical path, I don't see a reason not to save a few
cycles -- using stack allocation is not less readable.
I also cleaned up a bit of code along the way: moved variable
declarations into a more tightly-enclosing scope where possible,
fixed some pointless copying of strings in dblink, etc.
creation of a shell type. This allows a less hacky way of dealing with
the mutual dependency between a datatype and its I/O functions: make a
shell type, then make the functions, then define the datatype fully.
We should fix pg_dump to handle things this way, but this commit just deals
with the backend.
Martijn van Oosterhout, with some corrections by Tom Lane.
each tuple, as per my proposal of several days ago. Also, clean up
sort memory management by keeping all working data in a separate memory
context, and refine the handling of low-memory conditions.
the script is not executable, as UCS_to_most.pl is in CVS. It also won't
pick up any custom setting of the perl version/location to use. This
patch calls perl scripts like $(PERL) $(srcdir)/script.pl.
Kris Jurka
allocates the control data. The per-tape buffers are allocated only
on first use. This saves memory in situations where tuplesort.c
overestimates the number of tapes needed (ie, there are fewer runs
than tapes). Also, this makes legitimate the coding in inittapes()
that includes tape buffer space in the maximum-memory calculation:
when inittapes runs, we've already expended the whole allowed memory
on tuple storage, and so we'd better not allocate all the tape buffers
until we've flushed some tuples out of memory.
with fixed merge order (fixed number of "tapes") was based on obsolete
assumptions, namely that tape drives are expensive. Since our "tapes"
are really just a couple of buffers, we can have a lot of them given
adequate workspace. This allows reduction of the number of merge passes
with consequent savings of I/O during large sorts.
Simon Riggs with some rework by Tom Lane
up a bunch of the support utilities.
In src/backend/utils/mb/Unicode remove nearly duplicate copies of the
UCS_to_XXX perl script and replace with one version to handle all generic
files. Update the Makefile so that it knows about all the map files.
This produces a slight difference in some of the map files, using a
uniform naming convention and not mapping the null character.
In src/backend/utils/mb/conversion_procs create a master utf8<->win
codepage function like the ISO 8859 versions instead of having a separate
handler for each conversion.
There is an externally visible change in the name of the win1258 to utf8
conversion. According to the documentation notes, it was named
incorrectly and this changes it to a standard name.
Running the Unicode mapping perl scripts has shown some additional mapping
changes in koi8r and iso8859-7.
id (CVE-2006-0553). Also fix related bug in SET SESSION AUTHORIZATION that
allows unprivileged users to crash the server, if it has been compiled with
Asserts enabled. The escalation-of-privilege risk exists only in 8.1.0-8.1.2.
However, the Assert-crash risk exists in all releases back to 7.3.
Thanks to Akio Ishida for reporting this problem.
regardless of the current schema search path. Since CREATE OPERATOR CLASS
only allows one default opclass per datatype regardless of schemas, this
should have minimal impact, and it fixes problems with failure to find a
desired opclass while restoring dump files. Per discussion at
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php.
Remove now-redundant-or-unused code in typcache.c and namespace.c,
and backpatch as far as 8.0.
If the second output column value is 'a\nb', the 'b' should appear
in the second display column, rather than the first column as it
does now.
Change libpq's PQdsplen() to return more useful values.
> Note: this changes the PQdsplen function, it can now return zero or
> minus one which was not possible before. It doesn't appear anyone is
> actually using the functions other than psql but it is a change. The
> functions are not actually documentated anywhere so it's not like we're
> breaking a defined interface. The new semantics follow the Unicode
> standard.
BACKWARD COMPATIBLE CHANGE.
The only user-visible change I saw in the regression tests is that a
SELECT * on a table where all the columns have been dropped doesn't
return a blank line like before. This seems like a step forward.
Martijn van Oosterhout
the format stored in the tuple (Numeric) and the format used for calculation
(NumericVar) are different. I understand this is meant to reduce I/O. However,
when many comparisons or calculations of NUMERIC values are executed, the
conversion between Numeric and NumericVar becomes a bottleneck.
Here is the profile result when "create index" on a NUMERIC column is executed:
   %   cumulative    self               self     total
 time    seconds   seconds      calls  s/call   s/call  name
 17.61     10.27     10.27   34542006    0.00     0.00  cmp_numerics
 11.90     17.21      6.94   34542006    0.00     0.00  comparetup_index
  7.42     21.54      4.33   71102587    0.00     0.00  AllocSetAlloc
  7.02     25.64      4.09   69084012    0.00     0.00  set_var_from_num
  4.87     28.48      2.84   69084012    0.00     0.00  alloc_var
  4.79     31.27      2.79  142205745    0.00     0.00  AllocSetFreeIndex
  4.55     33.92      2.65   34542004    0.00     0.00  cmp_abs
  4.07     36.30      2.38   71101189    0.00     0.00  AllocSetFree
  3.83     38.53      2.23   69084012    0.00     0.00  free_var
The create index command executes many comparisons of Numeric values.
Functions other than comparetup_index spend a lot of cycles on the
conversion from Numeric to NumericVar.
The attached patch enables comparison of Numeric values without
converting them to NumericVar. The execution time of that SQL is
roughly halved.
o Test SQL (index_test table has 1,000,000 tuples)
create index index_test_idx on index_test(num_col);
o Test results (executed the test five times)
(1)PentiumIII
original: 39.789s 36.823s 36.737s 37.752s 37.019s
patched : 18.560s 19.103s 18.830s 18.408s 18.853s
(2)Pentium4
original: 16.349s 14.997s 12.979s 13.169s 12.955s
patched : 7.005s 6.594s 6.770s 6.740s 6.828s
(3)Itanium2
original: 15.392s 15.447s 15.350s 15.370s 15.417s
patched : 7.413s 7.330s 7.334s 7.339s 7.339s
(4)Ultra Sparc
original: 64.435s 59.336s 59.332s 58.455s 59.781s
patched : 28.630s 28.666s 28.983s 28.744s 28.595s
Atsushi Ogawa
While we normally prefer the notation "foo.*" for a whole-row Var, that does
not work at SELECT top level, because in that context the parser will assume
that what is wanted is to expand the "*" into a list of separate target
columns, yielding behavior different from a whole-row Var. We have to emit
just "foo" instead in that context. Per report from Sokolov Yura.
and rely exclusively on the SQL type system to tell the difference between
the types. Prevent creation of invalid CIDR values via casting from INET
or set_masklen() --- both of these operations now silently zero any bits
to the right of the netmask. Remove duplicate CIDR comparison operators,
letting the type rely on the INET operators instead.
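For example (a sketch of the zeroing behavior described above):
    SELECT '192.168.1.5/24'::inet::cidr;         -- 192.168.1.0/24: bits right of the netmask are zeroed
    SELECT set_masklen('10.1.2.3/32'::cidr, 8);  -- 10.0.0.0/8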
Continue to support GRANT ON [TABLE] for sequences for backward
compatibility; issue warning for invalid sequence permissions.
[Backward compatibility warning message.]
Add USAGE permission for sequences that allows only currval() and
nextval(), not setval().
Mention object name in grant/revoke warnings because of possible
multi-object operations.
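A short sketch of the new permission (sequence and role names are hypothetical):
    GRANT USAGE ON SEQUENCE orders_id_seq TO webuser;     -- permits currval() and nextval(), but not setval()
    GRANT UPDATE ON SEQUENCE orders_id_seq TO batch_job;  -- still required for setval()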
index's support-function cache (in index_getprocinfo). Since none of that
data can change for an index that's in active use, it seems sufficient to
treat all open indexes the same way we were treating "nailed" system indexes
--- that is, just re-read the pg_class row and leave the rest of the relcache
entry strictly alone. The pg_class re-read might not be strictly necessary
either, but since the reltablespace and relfilenode can change in normal
operation it seems safest to do it. (We don't support changing any of the
other info about an index at all, at the moment.)
Back-patch as far as 8.0. It might be possible to adapt the patch to 7.4,
but it would take more work than I care to expend for such a low-probability
problem. 7.3 is out of luck for sure.
This is utterly insignificant in normal operation, but it becomes a
problem during cache inval stress testing. The original coding in fact
had no leak --- the 8.0 List rewrite created the issue. I wonder whether
list_concat should pfree the discarded header?
cursors. Patch from Joachim Wieland, review and editorialization by Neil
Conway. The view lists cursors defined by DECLARE CURSOR, using SPI, or
via the Bind message of the frontend/backend protocol. This means the
view does not list the unnamed portal or the portal created to implement
EXECUTE. Because we do list SPI portals, there might be more rows in
this view than you might expect if you are using SPI implicitly (e.g.
via a procedural language).
Per recent discussion on -hackers, the query string included in the
view for cursors defined by DECLARE CURSOR is based on
debug_query_string. That means it is not accurate if multiple queries
separated by semicolons are submitted as one query string. However,
there doesn't seem to be a trivial fix for that: debug_query_string
is better than nothing. I also changed SPI_cursor_open() to include
the source text for the portal it creates: AFAICS there is no reason
not to do this.
Update the documentation and regression tests, bump the catversion.
fmgr_info(), in the TopMemoryContext. I couldn't see that the code
actually leaked, but in general I think it's fragile to assume that
pfree'ing an FmgrInfo along with its fn_extra field is enough to
reclaim all the resources allocated by fmgr_info(). I changed the
code to do its allocations in a new child context of
TopMemoryContext, MbProcContext. When we want to release the
allocations we can just reset the context, which is cleaner.
rather than "return expr;" -- the latter style is used in most of the
tree. I kept the parentheses when they were necessary or useful because
the return expression was complex.
listed in the column's most-common-values statistics entry. This gives
us an exact selectivity result for the portion of the column population
represented by the MCV list, which can be a big leg up in accuracy if
that's a large fraction of the population. The heuristics involving
pattern contents and prefix are applied only to the part of the population
not included in the MCV list.
and nail a couple more system indexes into cache. This doesn't make
any difference in normal system operation, but when forcing constant
cache resets it's difficult to get through the rules regression test
without these changes.
dead and have become unreferenced. Before 8.1, such members were left
for AtEOXact_CatCache() to clean up, but now AtEOXact_CatCache isn't
supposed to have anything to do. In an assert-enabled build this bug
leads to an assertion failure at transaction end, but in a non-assert
build the dead member is effectively just a small memory leak.
Per report from Jeremy Drake.
an LWLock instead of a spinlock. This hardly matters on Unix machines
but should improve startup performance on Windows (or any port using
EXEC_BACKEND). Per previous discussion.
from Andrus Moor. The former state-machine-style coding wasn't actually
doing much except obscuring the control flow, and it didn't extend
readily to fix this case, so I just took it out. Also, add a
YY_FLUSH_BUFFER call to ensure the lexer is reset correctly if the
previous scan failed partway through the file.
selection of a field from the result of a function returning RECORD.
I believe this case is new in 8.1; it's due to the addition of OUT parameters.
Per example from Michael Fuhr.
setup. This protects against undesired changes in locale behavior
if someone carelessly does setlocale(LC_ALL, "") (and we know who
you are, perl guys).
get_func_arg_info() for consistency with other names there.
This code will probably be useful to other PLs when they start to
support OUT parameters, so better to have it in the main backend.
Also, fix plpgsql validator to detect bogus OUT parameters even when
check_function_bodies is off.
(previously we only did = and <> correctly). Also, allow row comparisons
with any operators that are in btree opclasses, not only those with these
specific names. This gets rid of a whole lot of indefensible assumptions
about the behavior of particular operators based on their names ... though
it's still true that IN and NOT IN expand to "= ANY". The patch adds a
RowCompareExpr expression node type, and makes some changes in the
representation of ANY/ALL/ROWCOMPARE SubLinks so that they can share code
with RowCompareExpr.
I have not yet done anything about making RowCompareExpr an indexable
operator, but will look at that soon.
initdb forced due to changes in stored rules.
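For illustration, a spec-compliant row comparison (hypothetical table):
    SELECT * FROM events WHERE (ev_date, ev_id) > (DATE '2006-01-01', 100);
    -- equivalent to: ev_date > '2006-01-01' OR (ev_date = '2006-01-01' AND ev_id > 100)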
#define HIGHBIT (0x80)
#define IS_HIGHBIT_SET(ch) ((unsigned char)(ch) & HIGHBIT)
and removed CSIGNBIT and mapped its uses to HIGHBIT. I have also added
uses for IS_HIGHBIT_SET where appropriate. This change is
purely for code clarity.
See:
Subject: [HACKERS] bugs with certain Asian multibyte charsets
From: Tatsuo Ishii <ishii@sraoss.co.jp>
To: pgsql-hackers@postgresql.org
Date: Sat, 24 Dec 2005 18:25:33 +0900 (JST)
for more details.
Also make the code more robust by searching for target encoding
in the internal charset map.
Problem reported by Sagi Bashari on 2005/12/21.
See "[BUGS] BUG #2120: Crash when doing UTF8<->ISO_8859_8 encoding conversion"
on pgsql-bugs list for more details.
equal: if strcoll claims two strings are equal, check it with strcmp, and
sort according to strcmp if not identical. This fixes inconsistent
behavior under glibc's hu_HU locale, and probably under some other locales
as well. Also, take advantage of the now-well-defined behavior to speed up
texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise
comparison and not bother with strcoll at all.
NOTE: affected databases may need to REINDEX indexes on text columns to be
sure they are self-consistent.
Per my recent proposal. I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
that simplify_boolean_equality() may leave behind. This is only relevant
if the user writes something a bit silly, like CASE x=y WHEN TRUE THEN.
Per example from Michael Fuhr; may or may not explain bug #2106.
the data defining the semantics of a lock method (ie, conflict resolution
table and ancillary data, which is all constant) and the hash tables
storing the current state. The only thing we give up by this is the
ability to use separate hashtables for different lock methods, but there
is no need for that anyway. Put some extra fields into the LockMethod
definition structs to clean up some other uglinesses, like hard-wired
tests for DEFAULT_LOCKMETHOD and USER_LOCKMETHOD. This commit doesn't
do anything about the performance issues we were discussing, but it clears
away some of the underbrush that's in the way of fixing that.
Map them to a single day, so '30 hours' is 'AM'.
Have to_char(interval) and to_char(time) use "HH", "HH12" as 12-hour
intervals, rather than bypassing them and printing the full interval hours. This
is needed because to_char(time) is mapped to interval in this function.
Intervals should use "HH24", and document that suggestion.
Allow "D" format specifiers for interval/time.
qualification when the underlying operator is indexable and useOr is true.
That is, indexkey op ANY (ARRAY[...]) is effectively translated into an
OR combination of one indexscan for each array element. This only works
for bitmap index scans, of course, since regular indexscans no longer
support OR'ing of scans. There are still some loose ends to clean up
before changing 'x IN (list)' to translate as a ScalarArrayOpExpr;
for instance predtest.c ought to be taught about it. But this gets the
basic functionality in place.
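Sketch of the kind of qual this affects (table and index are hypothetical):
    -- with an index on orders(customer_id), this can be planned as a bitmap OR
    -- of one indexscan per array element:
    SELECT * FROM orders WHERE customer_id = ANY (ARRAY[10, 20, 30]);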
comment lines were output too long, and update typedefs for the /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
the array (for array_push) or higher-dimensional array (for array_cat)
rather than decrementing it as before. This avoids generating lower
bounds other than one for any array operation within the SQL spec. Per
recent discussion.
Interestingly, this seems to have been the original behavior, because
while updating the docs I noticed that a large fraction of relevant
examples were *wrong* for the old behavior and are now right. Is it
worth correcting this in the back-branch docs?
functionality, but I still need to make another pass looking at places
that incidentally use arrays (such as ACL manipulation) to make sure they
are null-safe. Contrib needs work too.
I have not changed the behaviors that are still under discussion about
array comparison and what to do with lower bounds.
to assume that the string pointer passed to set_ps_display is good forever.
There's no need to anyway since ps_status.c itself saves the string, and
we already had an API (get_ps_display) to return it.
I believe this explains Jim Nasby's report of intermittent crashes in
elog.c when %i format code is in use in log_line_prefix.
While at it, repair a previously unnoticed problem: on some platforms such as
Darwin, the string returned by get_ps_display was blank-padded to the maximum
length, meaning that lock.c's attempt to append " waiting" to it never worked.
create circularity of role memberships. This is a minimum-impact fix
for the problem reported by Florian Pflug. I thought about removing
the superuser_arg test from is_member_of_role() altogether, as it seems
redundant for many of the callers --- but not all, and it's way too late
in the 8.1 cycle to be making large changes. Perhaps reconsider this
later.
some small stylistic improvements in these functions. Also fix several
places where TMODULO() was being used with wrong-sized quotient argument,
creating a risk of overflow --- interval2tm was actually capable of going
into an infinite loop because of this.
since it can take a fair amount of time and this can confuse boot scripts
that expect postmaster.pid to appear quickly. Move initialization of SSL
library and preloaded libraries to after that point, too, just for luck.
Per reports from Tony Caduto and others.
fix problems with replacement-string backslashes that aren't followed by
one of the expected characters, avoid giving the impression that
replace_text_regexp() is meant to be called directly as a SQL function,
etc.
the facility has been set, the facility gets set to LOCAL0 and cannot
be changed later. This seems reasonably plausible to happen, particularly
at higher debug log levels, though I am not certain it explains Han Holl's
recent report. Easiest fix is to teach the code how to change the value
on-the-fly, which is nicer anyway. I made the settings PGC_SIGHUP to
conform with log_destination.