varlena type. (I did not force initdb, but you won't see the fix
unless you do one.) Also, make sure all index support operators and
functions are careful not to leak memory for toasted inputs; I had
missed some hash and rtree support ops on this point before.
both MULTIBYTE and TOAST prevent char(n) from being truly fixed-size.
Simplify and speed up fastgetattr() and index_getattr() macros by
eliminating special cases for attnum=1. It's just as fast to handle
the first attribute by presetting its attcacheoff to zero; so do that
instead when loading the tupledesc in relcache.c.
included by everything that includes bufmgr.h --- it's supposed to be
internals, after all, not part of the API! This fixes the conflict
against FreeBSD headers reported by Rosenman, by making it unnecessary
for s_lock.h to be included by plperl.c.
IPC key assignment will now work correctly even when multiple postmasters
are using same logical port number (which is possible given -k switch).
There is only one shared-mem segment per postmaster now, not 3.
Rip out broken code for non-TAS case in bufmgr and xlog, substitute a
complete S_LOCK emulation using semaphores in spin.c. TAS and non-TAS
logic is now exactly the same.
When deadlock is detected, "Deadlock detected" is now the elog(ERROR)
message, rather than a NOTICE that comes out before an unhelpful ERROR.
re-adopt these settings at every postmaster or standalone-backend startup.
This should fix problems with indexes becoming corrupt due to failure to
provide consistent locale environment for postmaster at all times. Also,
refuse to start up a non-locale-enabled compilation in a database originally
initdb'd with a non-C locale. Suppress LIKE index optimization if locale
is not "C" or "POSIX" (are there any other locales where it's safe?).
Issue NOTICE during initdb if selected locale disables LIKE optimization.
maintained for each cache entry. A cache entry will not be freed until
the matching ReleaseSysCache call has been executed. This eliminates
worries about cache entries getting dropped while still in use. See
my posting to pg-hackers of even date for more info.
Context diff this time.
Remove -m486 compile args for FreeBSD-i386, compile -O2 on i386.
Compile with only -O on alpha for codegen safety.
Make the port use the TEST_AND_SET for alpha and i386 on FreeBSD.
Fix a lot of bogus string formats for outputting pointers (cast to int
and %u/%x replaced with no cast and %p), and 'Size'(size_t) are now
cast to 'unsigned long' and output with %lu/
Remove an unused variable.
Alfred Perlstein
message about recursive use of a syscache. Also remove most of the
specialized indexscan routines in indexing.c --- it turns out that
catcache.c is perfectly able to perform the indexscan for itself,
in fact has already looked up all the information needed to do so!
This should be faster as well as needing far less boilerplate code.
(WAL logging for this is not done yet, however.) Clean up a number of really
crufty things that are no longer needed now that DROP behaves nicely. Make
temp table mapper do the right things when drop or rename affecting a temp
table is rolled back. Also, remove "relation modified while in use" error
check, in favor of locking tables at first reference and holding that lock
throughout the statement.
'AbortTransaction and not in in-progress state' when client disconnects
just after an error. Notice seems pretty harmless, so I'm not going
to worry about back-patching this into 7.0.* ...
source, due to addition of header overhead), store it as plain data
rather than pseudo-compressed data. This saves a few microseconds when
reading it out, but much more importantly guarantees that the toaster
won't actually expand tuples that contain incompressible data. That's
essential to avoid 'Tuple too big' failures with large objects.
from bufmgr - it would be nice to have separate hash in smgr
for node <--> fd mappings, but for the moment it's easy to
add new hash to relcache.
Fixed small bug in xlog.c:ReadRecord.
* Makefile: Add more standard targets. Improve shell redirection in GNU
make detection.
* src/backend/access/transam/rmgr.c: Fix incorrect(?) C.
* src/backend/libpq/pqcomm.c (StreamConnection): Work around accept() bug.
* src/include/port/unixware.h: ...with help from here.
* src/backend/nodes/print.c (plannode_type): Remove some "break"s after
"return"s.
* src/backend/tcop/dest.c (DestToFunction): ditto.
* src/backend/nodes/readfuncs.c: Add proper prototypes.
* src/backend/utils/adt/numutils.c (pg_atoi): Cope specially with strtol()
setting EINVAL. This saves us from creating an extra set of regression test
output for the affected systems.
* src/include/storage/s_lock.h (tas): Correct prototype.
* src/interfaces/libpq/fe-connect.c (parseServiceInfo): Don't use variable
as dimension in array definition.
* src/makefiles/Makefile.unixware: Add support for GCC.
* src/template/unixware: same here
* src/test/regress/expected/abstime-solaris-1947.out: Adjust whitespace.
* src/test/regress/expected/horology-solaris-1947.out: Part of this file
was evidently missing.
* src/test/regress/pg_regress.sh: Fix shell. mkdir -p returns non-zero if
the directory exists.
* src/test/regress/resultmap: Add entries for Unixware.
trying to toast tuples inserted into toast tables! Fix is two-pronged:
first, ensure all columns of a toast table are marked attstorage='p',
and second, alter the target chunk size so that it's less than the
threshold for trying to toast a tuple. (Code tried to do that but the
expression was wrong.) A few cosmetic cleanups in tuptoaster too.
NOTE: initdb forced due to change in toaster chunk-size.
These two routines will now ALWAYS elog() on failure, whether you ask for
a lock or not. If you really want to get a NULL return on failure, call
the new routines heap_open_nofail()/heap_openr_nofail(). By my count there
are only about three places that actually want that behavior. There were
rather more than three places that were missing the check they needed to
make under the old convention :-(.
allows fixing problems with operators that expected to be able to
return a NULL, such as the '#' line-segment-intersection operator
that tried to return NULL when the two segments don't intersect.
(See, eg, bug report from 1-Nov-99 on pghackers.) Fix some other
bugs in passing, such as backwards comparison in path_distance().
actually, but who could understand it with no comments? Fix bug
while at it: _bt_orderkeys would try to invoke comparisons on
NULL inputs, given the right sort of redundant quals.
left keys during bottom-up index build, and leave some free space
instead of packing the pages to the brim (so as to avoid vast numbers
of page splits during the first interactive insertions).
duplicate keys by letting search go to the left rather than right when an
equal key is seen at an upper tree level. Fix poor choice of page split
point (leading to insertion failures) that was forced by chaining logic.
Don't store leftmost key in non-leaf pages, since it's not necessary.
Don't create root page until something is first stored in the index, so an
unused index is now 8K not 16K. (Doesn't seem to be as easy to get rid of
the metadata page, unfortunately.) Massive cleanup of unreadable code,
fix poor, obsolete, and just plain wrong documentation and comments.
See src/backend/access/nbtree/README for the gory details.
pass-by-ref data types --- eg, an index on lower(textfield) --- no longer
leak memory during index creation or update. Clean up a lot of redundant
code ... did you know that copy, vacuum, truncate, reindex, extend index,
and bootstrap each basically duplicated the main executor's logic for
extracting information about an index and preparing index entries?
Functional indexes should be a little faster now too, due to removal
of repeated function lookups.
CREATE INDEX 'opt_type' clause is deimplemented by these changes,
but I haven't removed it from the parser yet (need to merge with
Thomas' latest change set first).
memory contexts. Currently, only leaks in expressions executed as
quals or projections are handled. Clean up some old dead cruft in
executor while at it --- unused fields in state nodes, that sort of thing.
Don't use DISABLE_COMPLEX_MACRO on Solaris. Don't define the
replacement function in the header file. Use -KPIC, not -K PIC.
Use CC to link C++ libraries, not ld/ar.
Eliminate file not found warnings in tcl build code.
for details). It doesn't really do that much yet, since there are no
short-term memory contexts in the executor, but the infrastructure is
in place and long-term contexts are handled reasonably. A few long-
standing bugs have been fixed, such as 'VACUUM; anything' in a single
query string crashing. Also, out-of-memory is now considered a
recoverable ERROR, not FATAL.
Eliminate a large amount of crufty, now-dead code in and around
memory management.
Fix problem with holding off SIGTRAP, SIGSEGV, etc in postmaster and
backend startup.
entries now for int8 and network hash indexes. int24_ops and int42_ops
are gone. pg_opclass no longer contains multiple entries claiming to be
the default opclass for the same datatype. opr_sanity regress test
extended to catch errors like these in the future.
materialized tupleset is small enough) instead of a temporary relation.
This was something I was thinking of doing anyway for performance, and Jan
says he needs it for TOAST because he doesn't want to cope with toasting
noname relations. With this change, the 'noname table' support in heap.c
is dead code, and I have accordingly removed it. Also clean up 'noname'
plan handling in planner --- nonames are either sort or materialize plans,
and it seems less confusing to handle them separately under those names.
passing the index-is-unique flag to index build routines (duh! ...
why wasn't it done this way to begin with?). Aside from eliminating
an eyesore, this should save a few milliseconds in btree index creation
because a full scan of pg_index is not needed any more.
--- ie, they're only called for side-effects. Add a PG_RETURN_VOID()
macro and use it where appropriate. This probably doesn't change the
machine code by a single bit ... it's just for documentation.
inputs have been converted to newstyle. This should go a long way towards
fixing our portability problems with platforms where char and short
parameters are passed differently from int-width parameters. Still
more to do for the Alpha port however.
it will close VFDs if necessary to surmount ENFILE or EMFILE failures.
Make use of this in md.c, xlog.c, and user.c routines that were
formerly vulnerable to these failures. In particular, this should
handle failures of mdblindwrt() that have been observed under heavy
load conditions. (By golly, every other process on the system may
crash after Postgres eats up all the kernel FDs, but Postgres will
keep going!)
That means you can now set your options in either or all of $PGDATA/configuration,
some postmaster option (--enable-fsync=off), or set a SET command. The list of
options is in backend/utils/misc/guc.c, documentation will be written post haste.
pg_options is gone, so is that pq_geqo config file. Also removed were backend -K,
-Q, and -T options (no longer applicable, although -d0 does the same as -Q).
Added to configure an --enable-syslog option.
changed all callers from TPRINTF to elog(DEBUG)
key call sites are changed, but most called functions are still oldstyle.
An exception is that the PL managers are updated (so, for example, NULL
handling now behaves as expected in plperl and plpgsql functions).
NOTE initdb is forced due to added column in pg_proc.
project I am working on (Recall - a distributed, fault-tolerant,
replicated, storage framework @ http://www.fault-tolerant.org).
Recall is written in C++. I need to include the postgres headers and
there are some problems when including the headers w/C++.
Attached is a patch generated from postgres/src that fixes my problems.
I was hoping to get this into the main source. It's very small (2k) and
3 files are changed: backend/utils/fmgr/fmgr.c,
backend/utils/Gen_fmgrtab.sh.in, and include/access/tupdesc.h.
In C++, you get a multiply defined symbol because the variable
(FmgrInfo *fmgr_pl_finfo) is defined in the header (the patch moves it
to the .c file). The other problem in tupdesc.h is the use of typeid
is a problem in c++ (I renamed it to oidtypeid).
Thanks,
Neal Norwitz
really ought to fix relcache entry construction so that it does not
do so much with CurrentMemoryContext = CacheCxt. As is, relatively
harmless leaks in either sequential or index scanning translate to
permanent leaks if they occur when called from relcache build.
For the moment, however, the path of least resistance is to repair
all such leaks...
as a shared dirtybit for each shared buffer. The shared dirtybit still
controls writing the buffer, but the local bit controls whether we need
to fsync the buffer's file. This arrangement fixes a bug that allowed
some required fsyncs to be missed, and should improve performance as well.
For more info see my post of same date on pghackers.
In the event of an elog() while the mode was set to immediate write,
there was no way for it to be set back to the normal delayed write.
The mechanism was a waste of space and cycles anyway, since the only user
was varsup.c, which could perfectly well call FlushBuffer directly.
Now it does just that, and the notion of a write mode is gone.
running gcc and HP's cc with warnings cranked way up. Signed vs unsigned
comparisons, routines declared static and then defined not-static,
that kind of thing. Tedious, but perhaps useful...
(Subj: [PORTS] initdb problem on NT with 7.0). Since nobody helped me,
I had to find out the reson. The difference between NT and Linux (for
instance) is that "open( path, O_RDWR );" opens a file in text mode. So
sometime less block can be read than required.
I suggest a following patch. BTW the situation appeared before, see
hba.c, pqcomm.c and others.
Alexei Zakharov
appropriate btree three-way comparison routine. Not clear why the
three-way comparison routines were being used in some paths and not
others in btree --- incomplete changes by someone long ago, maybe?
Anyway, this makes for a nice speedup in CREATE INDEX.
syscache and relcache flushes). Relcache entry rebuild now preserves
original tupledesc, rewrite rules, and triggers if possible, so that pointers
to these things remain valid --- if these things change while relcache entry
has positive refcount, we elog(ERROR) to avoid later crash. Arrange for
xact-local rels to be rebuilt when an SI inval message is seen for them,
so that they are updated by CommandCounterIncrement the same as regular rels.
(This is useful because of Hiroshi's recent changes to process our own SI
messages at CommandCounterIncrement time.) This allows simplification of
some routines that previously hacked around the lack of an automatic update.
catcache now keeps its own copy of tupledesc for its relation, rather than
depending on the relcache's copy; this avoids needing to reinitialize catcache
during a cache flush, which saves some cycles and eliminates nasty circularity
problems that occur if a cache flush happens while trying to initialize a
catcache.
Eliminate a number of permanent memory leaks that used to happen during
catcache or relcache flush; not least of which was that catcache never
freed any cached tuples! (Rule parsetree storage is still leaked, however;
will fix that separately.)
Nothing done yet about code that uses tuples retrieved by SearchSysCache
for longer than is safe.
Initdb help correction
Changed end/abort to commit/rollback and changed related notices
Commented out way old printing functions in libpq
Fixed a typo in alter table / alter column
pghackers discussion of 5-Jan-2000. The amopselect and amopnpages
estimators are gone, and in their place is a per-AM amcostestimate
procedure (linked to from pg_am, not pg_amop).
from a constraint condition does not violate the constraint (cf. discussion
on pghackers 12/9/99). Implemented by adding a parameter to ExecQual,
specifying whether to return TRUE or FALSE when the qual result is
really NULL in three-valued boolean logic. Currently, ExecRelCheck is
the only caller that asks for TRUE, but if we find any other places that
have the wrong response to NULL, it'll be easy to fix them.
relcache entry no longer leaks a small amount of memory. index_endscan
now releases all the memory acquired by index_beginscan, so callers of it
should NOT pfree the scan descriptor anymore.
Make all system indexes unique.
Make all cache loads use system indexes.
Rename *rel to *relid in inheritance tables.
Rename cache names to be clearer.
eliminating some wildly inconsistent coding in various parts of the
system. I set MAXPGPATH = 1024 in config.h.in. If anyone is really
convinced that there ought to be a configure-time test to set the
value, go right ahead ... but I think it's a waste of time.
when an initdb-forcing change has been applied within a development cycle.
PG_VERSION serves this purpose for official releases, but we can't bump
the PG_VERSION number every time we make a change to the catalogs during
development. Instead, increase the catalog version number to warn other
developers that you've made an incompatible change. See my mail to
pghackers for more info.
a generalized module 'tuplesort.c' that can sort either HeapTuples or
IndexTuples, and is not tied to execution of a Sort node. Clean up
memory leakages in sorting, and replace nbtsort.c's private implementation
of mergesorting with calls to tuplesort.c.
expressions in CREATE TABLE. There is no longer an emasculated expression
syntax for these things; it's full a_expr for constraints, and b_expr
for defaults (unfortunately the fact that NOT NULL is a part of the
column constraint syntax causes a shift/reduce conflict if you try a_expr.
Oh well --- at least parenthesized boolean expressions work now). Also,
stored expression for a column default is not pre-coerced to the column
type; we rely on transformInsertStatement to do that when the default is
actually used. This means "f1 datetime default 'now'" behaves the way
people usually expect it to.
BTW, all the support code is now there to implement ALTER TABLE ADD
CONSTRAINT and ALTER TABLE ADD COLUMN with a default value. I didn't
actually teach ALTER TABLE to call it, but it wouldn't be much work.
Implements the CREATE CONSTRAINT TRIGGER and SET CONSTRAINTS commands.
TODO:
Generic builtin trigger procedures
Automatic execution of appropriate CREATE CONSTRAINT... at CREATE TABLE
Support of new trigger type in pg_dump
Swapping of huge # of events to disk
Jan
is used to find start scan position of Indexscan-s.
To speed up finding scan start position,I have changed
_bt_first() to use as many keys as possible.
I'll attach the patch here.
Regards.
Hiroshi Inoue
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
See attached mail for more details.
-------------------------------------------------------------------
From: "Vadim Mikheev" <vadim@krs.ru>
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
References: <000201befa94$42fe04c0$2801007e@cadzone.tpf.co.jp>
Subject: Re: elog(ERROR) in vacuum
Date: Fri, 10 Sep 1999 10:27:10 +0900
Organization: OJSC Rostelecom (Krasnoyarsk)
Message-ID: <37D85E6E.5AFA126D@krs.ru>
Hiroshi Inoue wrote:
>
> Hello Vadim,
>
> I have a question about vacuum.
>
> VACUUM has a phase like commit which calls TransactionIdCommit().
> But if elog(ERROR) occured after that,the status of transaction is
> changed from XID_COMMIT to XID_ABORT.
>
> Seems to me this causes inconsistency.
> Shoudn't AbortTransaction() be changed not to call TransacionIdAbort()
> in case of vacuum.
You're right!
As usual -:)
Vadim
transaction abort --- before it only worked if there was exactly one level
of allocation context stacked in the blank portal. Now it does the right
thing for any depth, including zero...
has positive refcount, it is rebuilt from pg_class data. This ensures
that relcache entries will track changes made by other backends. Formerly,
a shared inval report would just be ignored if it happened to arrive while
the relcache entry was in use. Also, fix relcache to reset ref counts
to zero during transaction abort. Finally, change LockRelation() so that
it checks for shared inval reports after obtaining the lock. In this way,
once any kind of lock has been obtained on a rel, we can trust the relcache
entry to be up-to-date.
Also, move responsibility for calling vc_abort into main xact.c list of
things-to-call-at-abort. What in the world was it doing down inside of
TransactionIdAbort()?
care of equal-key cases, eliminating bt_firsteq(). The linear search
formerly done by bt_firsteq() took a lot of time in the case where many
equal keys appear on the same page.
this one could be useful for people experiencing out-of-memory crashes while
executing queries which retrieve or use a very large number of tuples.
The problem happens when storage is allocated for functions results used in
a large query, for example:
select upper(name) from big_table;
select big_table.array[1] from big_table;
select count(upper(name)) from big_table;
This patch is a dirty hack that fixes the out-of-memory problem for the most
common cases, like the above ones. It is not the final solution for the
problem but it can work for some people, so I'm posting it.
The patch should be safe because all changes are under #ifdef. Furthermore
the feature can be enabled or disabled at runtime by the `free_tuple_memory'
options in the pg_options file. The option is disabled by default and must
be explicitly enabled at runtime to have any effect.
To enable the patch add the follwing line to Makefile.custom:
CUSTOM_COPT += -DFREE_TUPLE_MEMORY
To enable the option at runtime add the following line to pg_option:
free_tuple_memory=1
Massimo