default, but OIDS are removed from many system catalogs that don't need them.
Some interesting side effects: TOAST pointers are 20 bytes not 32 now;
pg_description has a three-column key instead of one.
Bugs fixed in passing: BINARY cursors work again; pg_class.relhaspkey
has some usefulness; pg_dump dumps comments on indexes, rules, and
triggers in a valid order.
initdb forced.
per previous discussion on pghackers. Most of the duplicate code in
different AMs' ambuild routines has been moved out to a common routine
in index.c; this means that all index types now do the right things about
inserting recently-dead tuples, etc. (I also removed support for EXTEND
INDEX in the ambuild routines, since that's about to go away anyway, and
it cluttered the code a lot.) The retail indextuple deletion routines have
been replaced by a "bulk delete" routine in which the indexscan is inside
the access method. I haven't pushed this change as far as it should go yet,
but it should allow considerable simplification of the internal bookkeeping
for deletions. Also, add flag columns to pg_am to eliminate various
hardcoded tests on AM OIDs, and remove unused pg_am columns.
Fix rtree and gist index types to not attempt to store NULLs; before this,
gist usually crashed, while rtree managed not to crash but computed wacko
bounding boxes for NULL entries (which might have had something to do with
the performance problems we've heard about occasionally).
Add AtEOXact routines to hash, rtree, and gist, all of which have static
state that needs to be reset after an error. We discovered this need long
ago for btree, but missed the other guys.
Oh, one more thing: concurrent VACUUM is now the default.
validity checking rules for VACUUM. Make some other rearrangements of the
VACUUM code to allow more code to be shared between full and lazy VACUUM.
Minor code cleanups and added comments for TransactionId manipulations.
This makes VACUUM work properly with partial indexes, and avoids memory
leakage with functional indexes. Also, suppress complaint about fewer
index tuples than heap tuples when the index is a partial index.
From Martijn van Oosterhout.
useful as yet, since its primary source of information is (full) VACUUM,
which makes a concerted effort to get rid of free space before telling
the map about it ... next stop is concurrent VACUUM ...
have any newly-dead tuples on them. This is a longstanding deficiency
that prevents VACUUM from compacting a file as much as one would expect.
Change requires fixing repair_frag to not assume that fraged_pages is
a subset of vacuum_pages.
Also make some further cleanups of places that assumed page numbers fit
in int and tuple counts fit in uint32.
do anything yet, but it has the necessary connections to initialization
and so forth. Make some gestures towards allowing number of blocks in
a relation to be BlockNumber, ie, unsigned int, rather than signed int.
(I doubt I got all the places that are sloppy about it, yet.) On the
way, replace the hardwired NLOCKS_PER_XACT fudge factor with a GUC
variable.
database, including system catalogs (but not the shared catalogs,
since they don't really belong to his database). This is per recent
mailing list discussion. Clean up some other code that also checks
for database ownerness by introducing a test function is_dbadmin().
Python) to support shared extension modules, I have learned that Guido
prefers the style of the attached patch to solve the above problem.
I feel that this solution is particularly appropriate in this case
because the following:
PglargeType
PgType
PgQueryType
are already being handled in the way that I am proposing for PgSourceType.
Jason Tishler
a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored. ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values). Random
sampling is used to make the process reasonably fast even on very large
tables. The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.
There is more still to do; the new stats are not being used everywhere
they could be in the planner. But the remaining changes for this project
should be localized, and the behavior is already better than before.
A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
are treated more like 'cancel' interrupts: the signal handler sets a
flag that is examined at well-defined spots, rather than trying to cope
with an interrupt that might happen anywhere. See pghackers discussion
of 1/12/01.
are now critical sections, so as to ensure die() won't interrupt us while
we are munging shared-memory data structures. Avoid insecure intermediate
states in some code that proc_exit will call, like palloc/pfree. Rename
START/END_CRIT_CODE to START/END_CRIT_SECTION, since that seems to be
what people tend to call them anyway, and make them be called with () like
a function call, in hopes of not confusing pg_indent.
I doubt that this is sufficient to make SIGTERM safe anywhere; there's
just too much code that could get invoked during proc_exit().
table that inherits from a temp table. Make sure the right things happen
if one creates a temp table, creates another temp that inherits from it,
then renames the first one. (Previously, system would end up trying to
delete the temp tables in the wrong order.)
level" locks. A session lock is not released at transaction commit (but it
is released on transaction abort, to ensure recovery after an elog(ERROR)).
In VACUUM, use a session lock to protect the master table while vacuuming a
TOAST table, so that the TOAST table can be done in an independent
transaction.
I also took this opportunity to do some cleanup and renaming in the lock
code. The previously noted bug in ProcLockWakeup, that it couldn't wake up
any waiters beyond the first non-wakeable waiter, is now fixed. Also found
a previously unknown bug of the same kind (failure to scan all members of
a lock queue in some cases) in DeadLockCheck. This might have led to failure
to detect a deadlock condition, resulting in indefinite waits, but it's
difficult to characterize the conditions required to trigger a failure.
maintained for each cache entry. A cache entry will not be freed until
the matching ReleaseSysCache call has been executed. This eliminates
worries about cache entries getting dropped while still in use. See
my posting to pg-hackers of even date for more info.
Context diff this time.
Remove -m486 compile args for FreeBSD-i386, compile -O2 on i386.
Compile with only -O on alpha for codegen safety.
Make the port use the TEST_AND_SET for alpha and i386 on FreeBSD.
Fix a lot of bogus string formats for outputting pointers (cast to int
and %u/%x replaced with no cast and %p), and 'Size'(size_t) are now
cast to 'unsigned long' and output with %lu/
Remove an unused variable.
Alfred Perlstein
for views. Views are now have a "relkind" of
RELKIND_VIEW instead of RELKIND_RELATION.
Also, views no longer have actual heap storage
files.
The following changes were made
1. CREATE VIEW sets the new relkind
2. The executor complains if a DELETE or
INSERT references a view.
3. DROP RULE complains if an attempt is made
to delete a view SELECT rule.
4. CREATE RULE "_RETmytable" AS ON SELECT TO mytable DO INSTEAD ...
1. checks to make sure mytable is empty.
2. sets the relkind to RELKIND_VIEW.
3. deletes the heap storage files.
5. LOCK myview is not allowed. :)
6. the regression test type_sanity was changed to
account for the new relkind value.
7. CREATE INDEX ON myview ... is not allowed.
8. VACUUM myview is not allowed.
VACUUM automatically skips views when do the entire
database.
9. TRUNCATE myview is not allowed.
THINGS LEFT TO THINK ABOUT
o pg_views
o pg_dump
o pgsql (\d \dv)
o Do we really want to be able to inherit from views?
o Is 'DROP TABLE myview' OK?
--
Mark Hollomon
user is now defined in terms of the user id, the user name is only computed
upon request (for display purposes). This is kind of the opposite of the
previous state, which would maintain the user name and compute the user id
for permission checks.
Besides perhaps saving a few cycles (integer vs string), this now creates a
single point of attack for changing the user id during a connection, for
purposes of "setuid" functions, etc.
pass-by-ref data types --- eg, an index on lower(textfield) --- no longer
leak memory during index creation or update. Clean up a lot of redundant
code ... did you know that copy, vacuum, truncate, reindex, extend index,
and bootstrap each basically duplicated the main executor's logic for
extracting information about an index and preparing index entries?
Functional indexes should be a little faster now too, due to removal
of repeated function lookups.
CREATE INDEX 'opt_type' clause is deimplemented by these changes,
but I haven't removed it from the parser yet (need to merge with
Thomas' latest change set first).
Special handling of TOAST relations during VACUUM. TOAST relations
are vacuumed while the lock on the master table is still active.
The ANALYZE flag doesn't propagate to their vacuuming because the
toaster access routines allways use index access ignoring stats, so
why compute them at all.
Protection of TOAST relations against normal INSERT/UPDATE/DELETE
while offering SELECT for debugging purposes.
Jan
for details). It doesn't really do that much yet, since there are no
short-term memory contexts in the executor, but the infrastructure is
in place and long-term contexts are handled reasonably. A few long-
standing bugs have been fixed, such as 'VACUUM; anything' in a single
query string crashing. Also, out-of-memory is now considered a
recoverable ERROR, not FATAL.
Eliminate a large amount of crufty, now-dead code in and around
memory management.
Fix problem with holding off SIGTRAP, SIGSEGV, etc in postmaster and
backend startup.
discussion of 5/19/00). pg_index is now searched for indexes of a
relation using an indexscan. Moreover, this is done once and cached
in the relcache entry for the relation, in the form of a list of OIDs
for the indexes. This list is used by the parser and executor to drive
lookups in the pg_index syscache when they want to know the properties
of the indexes. Net result: index information will be fully cached
for repetitive operations such as inserts.
key call sites are changed, but most called functions are still oldstyle.
An exception is that the PL managers are updated (so, for example, NULL
handling now behaves as expected in plperl and plpgsql functions).
NOTE initdb is forced due to added column in pg_proc.
Hiroshi. ReleaseRelationBuffers now removes rel's buffers from pool,
instead of merely marking them nondirty. The old code would leave valid
buffers for a deleted relation, which didn't cause any known problems
but can't possibly be a good idea. There were several places which called
ReleaseRelationBuffers *and* FlushRelationBuffers, which is now
unnecessary; but there were others that did not. FlushRelationBuffers
no longer emits a warning notice if it finds dirty buffers to flush,
because with the current bufmgr behavior that's not an unexpected
condition. Also, FlushRelationBuffers will flush out all dirty buffers
for the relation regardless of block number. This ensures that
pg_upgrade's expectations are met about tuple on-row status bits being
up-to-date on disk. Lastly, tweak BufTableDelete() to clear the
buffer's tag so that no one can mistake it for being a still-valid
buffer for the page it once held. Formerly, the buffer would not be
found by buffer hashtable searches after BufTableDelete(), but it would
still be thought to belong to its old relation by the routines that
sequentially scan the shared-buffer array. Again I know of no bugs
caused by that, but it still can't be a good idea.
running gcc and HP's cc with warnings cranked way up. Signed vs unsigned
comparisons, routines declared static and then defined not-static,
that kind of thing. Tedious, but perhaps useful...
functions, which would lead to trouble with datatypes that paid attention
to the typelem or typmod parameters to these functions. In particular,
incorrect code in pg_aggregate.c explains the platform-specific failures
that have been reported in NUMERIC avg().
errors. VACUUM normally compacts the table back-to-front, and stops
as soon as it gets to a page that it has moved some tuples onto.
(This logic doesn't make for a complete packing of the table, but it
should be pretty close.) But the way it was checking whether it had
got to a page with some moved-in tuples was to look at whether the
current page was the same as the last page of the list of pages that
have enough free space to be move-in targets. And there was other
code that would remove pages from that list once they got full.
There was a kluge that prevented the last list entry from being
removed, but it didn't get the job done. Fixed by keeping a separate
variable that contains the largest block number into which a tuple
has been moved. There's no longer any need to protect the last element
of the fraged_pages list.
Also, fix NOTICE messages to describe elapsed user/system CPU time
correctly.
table owner in order to vacuum a table. This is mainly to prevent
denial-of-service attacks via repeated vacuums. Allow VACUUM to gather
statistics about system relations, except for pg_statistic itself ---
not clear that it's worth the trouble to make that case work cleanly.
Cope with possible tuple size overflow in pg_statistic tuples; I'm
surprised we never realized that could happen. Hold a couple of locks
a little longer to try to prevent deadlocks between concurrent VACUUMs.
There still seem to be some problems in that last area though :-(
parallel --- and, not incidentally, removing a common reason for needing
manual cleanup by the DB admin after a crash. Remove initial global
delete of pg_statistics rows in VACUUM ANALYZE; this was not only bad
for performance of other backends that had to run without stats for a
while, but it was fundamentally broken because it was done outside any
transaction. Surprising we didn't see more consequences of that.
Detect attempt to run VACUUM inside a transaction block. Check for
query cancel request before starting vacuum of each table. Clean up
vacuum's private portal storage if vacuum is aborted.
Make all system indexes unique.
Make all cache loads use system indexes.
Rename *rel to *relid in inheritance tables.
Rename cache names to be clearer.
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
neqsel now behave as per my suggestions in pghackers a few days ago.
selectivity for < > <= >= should work OK for integral types as well, but
still need work for nonintegral types. Since these routines have never
actually executed before :-(, this may result in some significant changes
in the optimizer's choices of execution plans. Let me know if you see
any serious misbehavior.
CAUTION: THESE CHANGES REQUIRE INITDB. pg_statistic table has changed.
/*
* Read above about cases when !ItemIdIsUsed(Citemid)
* (child item is removed)... Due to the fact that
* at the moment we don't remove unuseful part of
* update-chain, it's possible to get too old
* parent row here. Like as in the case which
* caused this problem, we stop shrinking here.
* I could try to find real parent row but want
* not to do it because of real solution will
* be implemented anyway, latter, and we are too
* close to 6.5 release. - vadim 06/11/99
*/
if (Ptp.t_data->t_xmax != tp.t_data->t_xmin)
...
and possibly for other cases too:
DO NOT cache status of transaction in unknown state
(i.e. non-committed and non-aborted ones)
Example:
T1 reads row updated/inserted by running T2 and cache T2 status.
T2 commits.
Now T1 reads a row updated by T2 and with HEAP_XMAX_COMMITTED
in t_infomask (so cached T2 status is not changed).
Now T1 EvalPlanQual gets updated row version without HEAP_XMIN_COMMITTED
-> TransactionIdDidCommit(t_xmin) and TransactionIdDidAbort(t_xmin)
return FALSE and T2 decides that t_xmin is not committed and gets
ERROR above.
It's too late to find more smart way to handle such cases and so
I just changed xact status caching and got rid TransactionIdFlushCache()
from code.
Changed: transam.c, xact.c, lmgr.c and transam.h - last three
just because of TransactionIdFlushCache() is removed.
2. heapam.c:
T1 marked a row for update. T2 waits for T1 commit/abort.
T1 commits. T3 updates the row before T2 locks row page.
Now T2 sees that new row t_xmax is different from xact id (T1)
T2 was waiting for. Old code did Assert here. New one goes to
HeapTupleSatisfiesUpdate. Obvious changes too.
3. Added Assert to vacuum.c
4. bufmgr.c: break
Assert(buf->r_locks == 0 && !buf->ri_lock)
into two Asserts.
2. varsup.c:ReadNewTransactionId(): don't read nextXid from disk -
this func doesn't allocate next xid, so ShmemVariableCache->nextXid
may be used (but GetNewTransactionId() must be called first).
3. vacuum.c: change elog(ERROR, "Child item....") to elog(NOTICE) -
this is not ERROR, proper handling is just not implemented, yet.
4. s_lock.c: increase S_MAX_BUSY by 2 times.
5. shmem.c:GetSnapshotData(): have to call ReadNewTransactionId()
_after_ SpinAcquire(ShmemIndexLock).
2. Get rid of locking when updating statistics in vacuum.
3. Use QuerySnapshot in COPY TO and call SetQuerySnashot
in main tcop loop before FETCH and COPY TO.
to save a little bit of backend startup time. This way, the first
backend started after a VACUUM will rebuild the init file with up-to-date
statistics for the critical system indexes.
2. Much faster btree tuples deletion in the case when first on page
index tuple is deleted (no movement to the left page(s)).
3. Remember blkno of new root page in BTPageOpaque of
left/right siblings when root page is splitted.
Ok. I made patches replacing all of "#if FALSE" or "#if 0" to "#ifdef
NOT_USED" for current. I have tested these patches in that the
postgres binaries are identical.
Here are two new patches for the Win32 support.
1) The patch based on the one from Hiroshi Inoue [Inoue@tpf.co.jp], to
load
Winsock.dll from libpq.dll.
2) A patch for psql.c to remove the call to WSAStartup(), since it is
not
required when it's done in libpq.dll.
I'm still looking for the possibility of having a crypt() function in
libpq.dll too, the same way getopt was included. Any chance of getting
this
before 6.4, or should we wait for the next one?
//Magnus
I don't know if this is really related to the initdb problem
discussion (haven't followed it enough). But seems so because
it fixes a damn problem during index tuple insertion on
CREATE TABLE into pg_attribute_relid_attnum_index.
Anyway - this bug was really hard to find. During startup the
relcache reads in some prepared information about index
strategies from a file and then reinitializes the function
pointers inside the scanKey data. But for sake it assumed
single attribute index tuples (hasn't that changed recently).
Thus not all the strategies scanKey entries where initialized
properly, resulting in invalid addresses for the btree
comparision functions.
With the patch at the end the regression tests passed
excellent except for the sanity_check that crashed at vacuum
and the misc test where the select unique1 from onek2 outputs
the two rows in different order.
Jan
no longer returns buffer pointer, can be gotten from scan;
descriptor; bootstrap can create multi-key indexes;
pg_procname index now is multi-key index; oidint2, oidint4, oidname
are gone (must be removed from regression tests); use System Cache
rather than sequential scan in many places; heap_modifytuple no
longer takes buffer parameter; remove unused buffer parameter in
a few other functions; oid8 is not index-able; remove some use of
single-character variable names; cleanup Buffer variables usage
and scan descriptor looping; cleaned up allocation and freeing of
tuples; 18k lines of diff;
As Bruce mentioned, this is due to the conflict among changes we made.
Included patches should fix the problem(I changed all MB to
MULTIBYTE). Please let me know if you have further problem.
P.S. I did not include pathces to configure and gram.c to save the
file size(configure.in and gram.y modified).
From: t-ishii@sra.co.jp
Attached are patches to enhance the multi-byte support. (patches are
against 7/18 snapshot)
* determine encoding at initdb/createdb rather than compile time
Now initdb/createdb has an option to specify the encoding. Also, I
modified the syntax of CREATE DATABASE to accept encoding option. See
README.mb for more details.
For this purpose I have added new column "encoding" to pg_database.
Also pg_attribute and pg_class are changed to catch up the
modification to pg_database. Actually I haved added pg_database_mb.h,
pg_attribute_mb.h and pg_class_mb.h. These are used only when MB is
enabled. The reason having separate files is I couldn't find a way to
use ifdef or whatever in those files. I have to admit it looks
ugly. No way.
* support for PGCLIENTENCODING when issuing COPY command
commands/copy.c modified.
* support for SQL92 syntax "SET NAMES"
See gram.y.
* support for LATIN2-5
* add UNICODE regression test case
* new test suite for MB
New directory test/mb added.
* clean up source files
Basic idea is to have MB's own subdirectory for easier maintenance.
These are include/mb and backend/utils/mb.
1. Removes the unnecessary "#define AbcRegProcedure 123"'s from
pg_proc.h.
2. Changes those #defines to use the names already defined in
fmgr.h.
3. Forces the make of fmgr.h in backend/Makefile instead of having
it
made as a dependency in access/common/Makefile *hack*hack*hack*
4. Rearranged the #includes to a less helter-skelter arrangement,
also
changing <file.h> to "file.h" to signify a non-system header.
5. Removed "pg_proc.h" from files where its only purpose was for
the
#defines removed in item #1.
6. Added "fmgr.h" to each file changed for completeness sake.
Turns out that #6 was not necessary for some files because fmgr.h
was being included in a roundabout way SIX levels deep by the first
include.
"access/genam.h"
->"access/relscan.h"
->"utils/rel.h"
->"access/strat.h"
->"access/skey.h"
->"fmgr.h"
So adding fmgr.h really didn't add anything to the compile, hopefully
just made it clearer to the programmer.
S Darren.
Patch by: wieck@sapserv.debis.de (Jan Wieck)
One of the design rules of PostgreSQL is extensibility. And
to follow this rule means (at least for me) that there should
not only be a builtin PL. Instead I would prefer a defined
interface for PL implemetations.
Makefile.global.
End result, if all goes well, should allow for much easier porting, since
there will no longer be a concept of a "port". Most, if not everything,
*should* be determined by configure, or by the compiler itself. Still
work to be done though :)
async.c: #include <port-protos.h> surrounded by an #ifdef HAVE_STRDUP
vacuum.c: #include <port-protos.h> commented out...can someone comment as
to why it was included, as it doesn't seem to have any effect
under FreeBSD so far...would like some sort of #ifdef wrapper
like async.c if possible
start time equal to tuple->t_tmax.
Privent shrinking if there are tuples modifyed by running transactions
(it concerns system relations only, currently).
table. The table name is de-allocated by the CommitTransactionCommand()
in vc_init() before it is copied in VacRel.data and sometimes this causes
a SIGSEGV. My patch simply moves the strcpy before vc_init.
Submitted by Massimo Dal Zotto <dz@cs.unitn.it>.
2. Reap unused tuples too
3. Reap empty pages
4. Check if a page is initialized, initialize it if not
and reap it
5. Binary search in list of reapped pages/tids to check
is the heap' tid pointed by a index' tuple on this list
(it's mu-u-uch faster)