in Turkish locale. Keywords are now checked under pure ASCII case-folding
rules ('A'-'Z'->'a'-'z' and nothing else). However, once a word is
determined not to be a keyword, it will be case-folded under the current
locale, same as before. See pghackers discussion 20-Feb-01.
waste of cycles on single-CPU machines, and of dubious utility on multi-CPU
machines too.
Tweak s_lock_stuck so that caller can specify timeout interval, and
increase interval before declaring stuck spinlock for buffer locks and XLOG
locks.
On systems that have fdatasync(), use that rather than fsync() to sync WAL
log writes. Ensure that WAL file is entirely allocated during XLogFileInit.
either wrong or unnecessary in most cases, and on systems where setting
status takes a kernel call, the overhead of setting status three times
per command rather than two is annoying.
1. If there is exactly one pg_operator entry of the right name and oprkind,
oper() and related routines would return that entry whether its input type
had anything to do with the request or not. This is just premature
optimization: we shouldn't return the single candidate until after we verify
that it really is a valid candidate, ie, is at least coercion-compatible
with the given types.
2. oper() and related routines only promise a coercion-compatible result.
Unfortunately, there were quite a few callers that assumed the returned
operator is binary-compatible with the given datatype; they would proceed
to call it without making any datatype coercions. These callers include
sorting, grouping, aggregation, and VACUUM ANALYZE. In general I think
it is appropriate for these callers to require an exact or binary-compatible
match, so I've added a new routine compatible_oper() that only succeeds if
it can find an operator that doesn't require any run-time conversions.
Callers now call oper() or compatible_oper() depending on whether they are
prepared to deal with type conversion or not.
The upshot of these bugs is revealed by the following silliness in PL/Tcl's
selftest: it creates an operator @< on int4, and then tries to use it to
sort a char(N) column. The system would let it do that :-( (and evidently
has done so since 6.3 :-( :-(). The result in this case was just a silly
sort order, but the reverse combination would've provoked coredump from
trying to dereference integers. With this fix you get more reasonable
behavior:
pltcl_test=# select * from T_pkey1 order by key1, key2 using @<;
ERROR: Unable to identify an operator '@<' for types 'bpchar' and 'bpchar'
You will have to retype this query using an explicit cast
compressed storage works perfectly well. Might as well have a coherent
strategy for applying it, rather than the haphazard store-what-you-get
approach that was in the code before. The strategy I've set up here is
to attempt compression of any compressible index value exceeding
BLCKSZ/16, or about 500 bytes by default.
the old ones were not small enough to ensure r-tree and gist indexes would
get picked when available. These numbers are totally bogus anyway, but
in the absence of any real estimation technique, we'd like to select
indexes when available ...
such as
SELECT f1 FROM foo UNION SELECT ... ORDER BY upper(f1)
to draw
'ORDER BY on a UNION/INTERSECT/EXCEPT result must be on one of the result columns'
rather than the uninformative 'f1 not found' we were producing before.
Eventually this should actually work, but that looks much too hard to try
to implement in late beta...
clause with an alias is a <subquery> and therefore hides table references
appearing within it, according to the spec. This is the same as the
preliminary patch I posted to pgsql-patches yesterday, plus some really
grotty code in ruleutils.c to reverse-list a query tree with the correct
alias name depending on context. I'd rather not have done that, but unless
we want to force another initdb for 7.1, there's no other way for now.
as previously discussed.
It makes AIX and IRIX not use DST for dates before 1970.
The following expected files need to be removed from the regression tests,
they contain wrong results and are not needed any more.
src/test/regress/expected/horology-1947-PDT.out
src/test/regress/expected/tinterval-1947-PDT.out
src/test/regress/expected/abstime-1947-PDT.out
Zeugswetter Andreas
definitions from K&R to ANSI C style, and fix broken assumption that
int and long are the same datatype. This repairs problems observed
on Alpha with regexps having between 32 and 63 states.
two transactions create the same table name concurrently, the one that
fails will complain about unique index pg_class_relname_index, rather than
about pg_type_typname_index which'll confuse most people. Free side
benefit: pg_class.reltype is correctly linked to the pg_type entry now.
It's been zero in all but the preloaded pg_class entries since who knows
when.
are now separate files "postgres.h" and "postgres_fe.h", which are meant
to be the primary include files for backend .c files and frontend .c files
respectively. By default, only include files meant for frontend use are
installed into the installation include directory. There is a new make
target 'make install-all-headers' that adds the whole content of the
src/include tree to the installed fileset, for use by people who want to
develop server-side code without keeping the complete source tree on hand.
Cleaned up a whole lot of crufty and inconsistent header inclusions.
any other client connections that may exist (which would only happen if
another client is currently in the authentication cycle). This avoids
wastage of open descriptors in a child. It might also explain peculiar
behaviors like not closing connections when expected, since the kernel
will probably not signal EOF as long as some other backend is randomly
holding open a reference to the connection, even if the client went away
long since ...
elog(ERROR) not an Assert trap, since we've downgraded out-of-memory to
elog(ERROR) not a fatal error. Also, change the hard boundary from 256Mb
to 1Gb, just so that anyone who's actually got that much memory to spare
can play with TOAST objects approaching a gigabyte.
allocated by plan nodes are not leaked at end of query. This doesn't
really matter for normal queries, but it sure does for queries invoked
repetitively inside SQL functions. Clean up some other grotty code
associated with tupdescs, and fix a few other memory leaks exposed by
tests with simple SQL functions.
original table ('OLD' table) in its join tree if OLD is referenced by
either the rule action, the rule qual, or the original query qual that
will be added to the rule action. However, we only want one instance
of the original table to be included; so beware of the possibility that
the rule action already has a jointree entry for OLD.
rather than coredumping (as prior 7.1 code did) or silently dropping the
condition (as 7.0 did). This is annoying but there doesn't seem to be
any good way to fix it, short of a major querytree restructuring.
actually) to ensure that its file access time doesn't get old enough to
tempt a /tmp directory cleaner to remove it. Still another reason we
should never have put the sockets in /tmp in the first place ...
truncating to integer. Remove regress test that checks whether
4567890123456789 can be converted to float without loss; since that's
52 bits, it's on the hairy edge of failing with IEEE float8s, and indeed
rint seems to give platform-dependent results for it.
and new root page if old root one was splitted but new root page
wasn't created.
New code is protected by FixBTree bool flag setted to FALSE, so
nothing should be affected by this untested approach.
to the use of getpwuid when running in standalone mode.
this patch allocates some persistent storage (using
strdup) to store the username obtained with getpwuid
in src/backend/main/main.c. this is necessary because
later on, getpwuid is called again (in ValidateBinary).
the man pages for getpwuid on SCO OpenServer, FreeBSD,
and Darwin all have words to this effect (this is from
the SCO OpenServer man page):
Note
====
All information is contained in a static area, so it must
be copied if it is to be saved. Otherwise, it may be
overwritten on subsequent calls to these routines.
in particular, on my platform, the storage used to hold
the pw_name from the first call is overwritten such that
it looks like an empty username. this causes a problem
later on in SetSessionUserIdFromUserName.
i'd assume this isn't a problem on most platforms because
getpwuid is called with the same UID both times, and the
same thing ends up happening to that static storage each
time. however, that's not guaranteed, and is _not_ what
happens on my platform (at least :).
this is for the version of 7.1 available via anon cvs as
of Tue Jan 23 15:14:00 2001 PST:
.../src/backend/main/main.c,v 1.37 2000/12/31 18:04:35 tgl Exp
-michael thornburgh, zenomt@armory.com
than forcing 'plain'. This probably does not matter right now, but I
think it needs to be consistent with the regular (not-functional) index
case, where attstorage is copied from the underlying table. Clean up
some other dead and infelicitous code too.
Op, so that the sequence 'a_expr Op Op a_expr' will be parsed as
a_expr Op (Op a_expr) not (a_expr Op) Op a_expr as formerly. In other
words, prefer treating user-defined operators as prefix operators to
treating them as postfix operators, when there is an ambiguity.
Also clean up a couple of other infelicities in production priority
assignment --- for example, BETWEEN wasn't being given the intended
priority, but that of AND.
bothering to check the return value --- which meant that in case the
update or delete failed because of a concurrent update, you'd not find
out about it, except by observing later that the transaction produced
the wrong outcome. There are now subroutines simple_heap_update and
simple_heap_delete that should be used anyplace that you're not prepared
to do the full nine yards of coping with concurrent updates. In
practice, that seems to mean absolutely everywhere but the executor,
because *noplace* else was checking.
attributes in a FieldSelect node --- all the places that manipulate
these work just fine with system attribute numbers. OK, it's a new
feature, so shoot me ...
eliminates a raft of portability issues, including whether sys_nerr
exists, whether the platform has any valid negative errnos, etc. The
downside is minimal: errno shouldn't ever contain an invalid value anyway,
and if it does, reasonably modern versions of strerror will not choke.
This rangecheck idea seemed good at the time, but it's clearly a net loss,
and I apologize to all concerned for having ever put it in.
rewrite of deadlock checking. Lock holder objects are now reachable from
the associated LOCK as well as from the owning PROC. This makes it
practical to find all the processes holding a lock, as well as all those
waiting on the lock. Also, clean up some of the grottier aspects of the
SHMQueue API, and cause the waitProcs list to be stored in the intuitive
direction instead of the nonintuitive one. (Bet you didn't know that
the code followed the 'prev' link to get to the next waiting process,
instead of the 'next' link. It doesn't do that anymore.)
here is the patch attached which do check in each BLOB operation, if we are
in transaction, and raise an error otherwise. This will prevent such mistakes.
--
Sincerely Yours,
Denis Perchine
of c.h altogether, and putting it into the only places that use it
(elog.c and exc.c), instead. Modify these routines to check for a
NULL or empty-string return from strerror, too, since some platforms
define strerror to return empty string for unknown errors (what a useless
definition that is ...). Clean up some cruft in ExcPrint while at it.
mixed-signs. Previous effort left way too many minus signs, and was at
least as broken as the one before that :(
Clean up "ISO-style" time interval representation to omit zero fields if
there is at least one non-zero field. Supress some leading plus signs
when not necessary for clarity.
Replace every #ifdef __CYGWIN__ block with a cleaner TIMEZONE_GLOBAL macro
defined in datetime.h.
try to push restrictions on the view down into the view subquery,
so that they can become indexscan quals or what-have-you rather than
being applied at the top level of the subquery. 7.0 and before were
able to do this, though in a much klugier way, and I'd hate to have
anyone complaining that 7.1 is stupider than 7.0 ...
using POSIX semaphores more robust on Darwin 1.2/Mac OS X
Public Beta. this is for the version of 7.1 available
via anon cvs as of Jan 14 2001 14:00 PST.
since the semaphores and shared memory created by this
emulator are shared with the backends via fork(), their
persistent names are not necessary. removing their
names with shm_unlink() and sem_unlink() after creation
obviates the need for any "ipcclean" function. further,
without these changes, the shared memory (and, therefore,
the semaphores) will not be re-initialized/re-created after
the first execution of the postmaster, until reboot
or until some (non-existent) ipcclean function is executed.
this patch does the following:
1) if the shared memory segment "SysV_Sem_Info" already
existed, it is cleaned up. it shouldn't be there anyways.
2) the real indicator for whether the shared memory/semaphore
emulator has been initialized is if "SemInfo" has been
initialized. the shared memory and semaphores must be
initialized regardless of whether there was a garbage shared
memory segment lying around.
3) the shared memory segment "SysV_Sem_Info" is created with "O_EXCL"
to catch the case where two postmasters might be starting
simultaneously, so they don't both end up with the same shared
memory (one will fail). note that this can't be done with the
semaphores because Darwin 1.2 has a bug where attempting to
open an existing semaphore with "O_EXCL" set will ruin the
semaphore until the next reboot.
4) the shared memory segment "SysV_Sem_Info" is unlinked after
it is created. it will then exist without a name until the
postmaster and all backend children exit.
5) all semaphores are unlinked after they are created. they'll
then exist without names until the postmaster and all backend
children exit.
-michael thornburgh, zenomt@armory.com
Not sure why some were this way, and others were already correct, but it
seems to have been like this for several years.
This caused problems on a few damaged platforms like AIX and IRIX which do
not support DST calculations for years before 1970.
Thanks to Andreas Zeugswetter <ZeugswetterA@wien.spardat.at> for finding
the problem.
I hope. I finally realized that we were going at it backwards: when
there are excess parentheses, they need to be treated as part of the
sub-SELECT, not as part of the surrounding expression. Although either
choice yields an unambiguous grammar, only this way produces a grammar
that is LALR(1). With the old approach we were guaranteed to fail on
either 'SELECT (((SELECT 2)) + 3)' or
'SELECT (((SELECT 2)) UNION SELECT 2)' depending on which way we
resolve the initial shift/reduce conflict. With the new way, the same
reduction track can be followed in both cases until we have advanced
far enough to know whether we are done with the sub-SELECT or not.
given the fundamental restriction of not looking at transaction commit
data in pg_log. Use code that is actually based on tqual.c rather than
ad-hoc tests. Also write the tuple fetch loop using standard access
macros rather than ad-hoc code.
are treated more like 'cancel' interrupts: the signal handler sets a
flag that is examined at well-defined spots, rather than trying to cope
with an interrupt that might happen anywhere. See pghackers discussion
of 1/12/01.
are now critical sections, so as to ensure die() won't interrupt us while
we are munging shared-memory data structures. Avoid insecure intermediate
states in some code that proc_exit will call, like palloc/pfree. Rename
START/END_CRIT_CODE to START/END_CRIT_SECTION, since that seems to be
what people tend to call them anyway, and make them be called with () like
a function call, in hopes of not confusing pg_indent.
I doubt that this is sufficient to make SIGTERM safe anywhere; there's
just too much code that could get invoked during proc_exit().
1. Support of variable size keys - new algorithm of insertion to tree
(GLI - gist layrered insertion). Previous algorithm was implemented
as described in paper by Joseph M. Hellerstein et.al
"Generalized Search Trees for Database Systems". This (old)
algorithm was not suitable for variable size keys and could be
not effective ( walking up-down ) in case of multiple levels split
Bug fixed:
1. fixed bug in gistPageAddItem - key values were written to disk
uncompressed. This caused failure if decompression function
does real job.
2. NULLs handling - we keep NULLs in tree. Right way is to remove them,
but we don't know how to inform vacuum about index statistics. This is
just cosmetic warning message (like in case with R-Tree),
but I'm not sure how to recognize real problem if we remove NULLs
and suppress this warning as Tom suggested.
3. various memory leaks
This work was done by Teodor Sigaev (teodor@stack.net) and
Oleg Bartunov (oleg@sai.msu.su).
- no more elog(STOP) in StartupXLOG();
- both checkpoint' undo & redo are used to define
oldest on-line log file.
2. Ability to pre-allocate a few log files at checkpoint time
(wal_files option). Off by default.
as both a GROUP BY item and an output expression, the top-level Group
node should just copy up the evaluated expression value from its input,
rather than re-evaluating the expression. Aside from any performance
benefit this might offer, this avoids a crash when there is a sub-SELECT
in said expression.
before calling RelationInvalidateHeapTuple(), which is bad because the
latter needs to look at the tuple data, which is in the shared disk
buffer. If another backend manages to recycle the buffer while this
is going on, we will compute the wrong hashindex for the tuple or
maybe even crash outright. Must hold buffer refcount until afterwards.
(This bug is not in 7.0.*; seems to be have introduced during WAL changes.)
and burn. Just for added luck, change reading of CONST nodes so that
we do not need to consult pg_type rows while reading them; this means
that no database access occurs during stringToNode. This requires
changing the order in which const-node fields are written, which means
an initdb is forced.
in per-entry sub-memory-context, where they were supposed to go, rather
than in CacheMemoryContext where the code was putting them. Must've
suffered a severe brain fade when I wrote this :-(
sequences. This is done by disabling multi-byte awareness when it's
not necessary. This is kind of a workaround, not a perfect solution.
However, there is no ideal way to parse broken multi-byte character
sequences. So I guess this is the best way what we could do right
now...
and revert documentation to describe the existing INHERITS clause
instead, per recent discussion in pghackers. Also fix implementation
of SQL_inheritance SET variable: it is not cool to look at this var
during the initial parsing phase, only during parse_analyze(). See
recent bug report concerning misinterpretation of date constants just
after a SET TIMEZONE command. gram.y really has to be an invariant
transformation of the query string to a raw parsetree; anything that
can vary with time must be done during parse analysis.
Previous result did not have correct month boundaries so anything near edge
cases was suspect (e.g. April was in Q1 and July, August were lumped into
Q2).
Thanks to Denis Osadchy <osadchy@turbo.nsk.su> for the report.
starting a new hashtable search no longer clobbers any other search
active anywhere in the system. Fix RelationCacheInvalidate() so that
it will not crash or go into an infinite loop if invoked recursively,
as for example by a second SI Reset message arriving while we are still
processing a prior one.
In theory we should always get EEXIST if there's a key collision, but
if the kernel code tests error conditions in a weird order, perhaps
EACCES or EIDRM could occur too.
assume that TAS() will always succeed the first time, even if the lock
is known to be free. Also, make sure that code will eventually time out
and report a stuck spinlock, rather than looping forever. Small cleanups
in s_lock.h, too.
1. Distinguish cases where a Datum representing a tuple datatype is an OID
from cases where it is a pointer to TupleTableSlot, and make sure we use
the right typlen in each case.
2. Make fetchatt() and related code support 8-byte by-value datatypes on
machines where Datum is 8 bytes. Centralize knowledge of the available
by-value datatype sizes in two macros in tupmacs.h, so that this will be
easier if we ever have to do it again.
table that inherits from a temp table. Make sure the right things happen
if one creates a temp table, creates another temp that inherits from it,
then renames the first one. (Previously, system would end up trying to
delete the temp tables in the wrong order.)
recommendation from Paul Vixie. Add a new abbrev() function to produce
abbreviated format as text. No forced initdb, but new function is not
available unless you do an initdb or add the pg_proc row manually.
level" locks. A session lock is not released at transaction commit (but it
is released on transaction abort, to ensure recovery after an elog(ERROR)).
In VACUUM, use a session lock to protect the master table while vacuuming a
TOAST table, so that the TOAST table can be done in an independent
transaction.
I also took this opportunity to do some cleanup and renaming in the lock
code. The previously noted bug in ProcLockWakeup, that it couldn't wake up
any waiters beyond the first non-wakeable waiter, is now fixed. Also found
a previously unknown bug of the same kind (failure to scan all members of
a lock queue in some cases) in DeadLockCheck. This might have led to failure
to detect a deadlock condition, resulting in indefinite waits, but it's
difficult to characterize the conditions required to trigger a failure.
applied to the duplicated subtree twice. Probably someday we should
fix the parser not to generate multiple links to the same subtree,
but for now a quick copyObject() is the path of least resistance.
observed by Inoue. Also, don't call ProcRemove() from postmaster if we
have detected a backend crash --- too risky if shared memory is corrupted.
It's not needed anyway, considering we are going to reinitialize shared
memory and semaphores as soon as the last child is dead.
>> xlog.c : special case for beos to avoid 'link' which does not work yet
>> beos/sem.c : implementation of new sem_ctl call (GETPID) and a new
>sem_op
>> flag (IPCNOWAIT)
>> dynloader/beos.c : add a verification of symbol validity (seem that
the
>> loader sometime return OK with an invalid symbol)
>> postmaster.c : add beos forking support for the new checkpoint
process
>> postgres.c : remove beos special case for getrusage
>> beos.h : Correction of a bas definition of AF_UNIX, misc defnitions
>>
>>
>> thanks
>>
>>
>> cyril
Cyril VELTER
might change it. Experimentation shows that the signal handler call
mechanism does not save/restore errno for you, at least not on Linux
or HPUX, so this is definitely a real risk.
to ensure that we have released buffer refcounts and so forth, rather than
putting ad-hoc operations before (some of the calls to) proc_exit. Add
commentary to discourage future hackers from repeating that mistake.
> Date: Thu, 14 Dec 2000 12:44:47 +0100 (CET)
> From: Kovacs Zoltan Sandor <tip@pc10.radnoti-szeged.sulinet.hu>
> To: pgsql-bugs@postgresql.org
> Subject: [BUGS] to_char() causes backend to close connection
>
> Hi, this query gives different strange results:
>
> select to_char(now()::abstime,'YYMMDDHH24MI');
>
> I get e.g. a "backend closed the channel unexpectedly..." error with
> successful or failed resetting attempt (indeterministic)
Again thanks Kovacs, you found really designing bug, that appear
if anyone write bad format template to "number" version of to_char()
(as you with 'DD').
Karel
comparison does not consider paths different when they differ only in
uninteresting aspects of sort order. (We had a special case of this
consideration for indexscans already, but generalize it to apply to
ordered join paths too.) Be stricter about what is a canonical pathkey
to allow faster pathkey comparison. Cache canonical pathkeys and
dispersion stats for left and right sides of a RestrictInfo's clause,
to avoid repeated computation. Total speedup will depend on number of
tables in a query, but I see about 4x speedup of planning phase for
a sample seven-table query.
OIDs rather than names. Aside from being simpler and faster, this way
doesn't blow up in the face of 'create temp table foo () inherits (foo)'.
Which is a rather odd thing to do, but it seems some people want to.
avoid repeated evaluations in cost_qual_eval(). This turns out to save
a useful fraction of planning time. No change to external representation
of RestrictInfo --- although that node type doesn't appear in stored
rules anyway.
varlena type. (I did not force initdb, but you won't see the fix
unless you do one.) Also, make sure all index support operators and
functions are careful not to leak memory for toasted inputs; I had
missed some hash and rtree support ops on this point before.
value greater than one. The behavior this sought to disallow doesn't
seem any less confusing than the other behaviors of cached sequences.
Improve wording of some error messages, too.
Update documentation accordingly. Also add an explanation that
aborted transactions do not roll back their nextval() calls; this
seems to be a FAQ, so it ought to be mentioned here...
As I read it, the spec requires a non-null result in some cases where
one of the inputs is NULL: specifically, if the other endpoint of that
interval is between the endpoints of the other interval, then the result
is known TRUE despite the missing endpoint. The spec could've been a
lot simpler if they did not intend this behavior.
I did not force an initdb for this change, but if you don't do one you'll
still see the old strict-function behavior.
work where we can (given that the executor only handles it at top level)
and generate an error where we can't. Note that while the parser has
been allowing views to say SELECT FOR UPDATE for a few weeks now, that
hasn't actually worked until just now.
an error as we used to. In an OUTER JOIN scenario, retrieving a null
CTID from one of the input relations is entirely expected. We still
want to lock the input rows from the other relations, so just ignore
the null and keep going.
I believe this should fix the issue that Philip Warner
noticed about the check for unique constraints meeting the
referenced keys of a foreign key constraint allowing the
specification of a subset of a foreign key instead of
rejecting it. I also added tests for a base case of
this to the foreign key and alter table tests and patches
for expected output.
report from Joel Burton. Turns out that my simple idea of turning the
SELECT into a subquery does not interact well *at all* with the way the
rule rewriter works. Really what we need to make INSERT ... SELECT work
cleanly is to decouple targetlists from rangetables: an INSERT ... SELECT
wants to have two levels of targetlist but only one rangetable. No time
for that for 7.1, however, so I've inserted some ugly hacks to make the
rewriter know explicitly about the structure of INSERT ... SELECT queries.
Ugh :-(
Allow some operator-like tokens to be used as function names.
Flesh out support for time, timetz, and interval operators
and interactions.
Regression tests pass, but non-reference-platform horology test results
will need to be updated.
since those routines may do palloc's. We want to be fairly sure we can
send the error message to the client even under low-memory conditions.
That's what we stashed away 8K in ErrorContext for, after all ...
not-very-good handling of mid-size allocation requests. Do everything via
either the "small" case (chunk size rounded up to power of 2) or the "large"
case (pass it straight off to malloc()). Increase the number of freelists
a little to set the breakpoint between these behaviors at 8K.