From 9e241829522c4cebf53ec3389c7f031234282003 Mon Sep 17 00:00:00 2001 From: Bruce Momjian Date: Fri, 18 Jun 2004 16:04:13 +0000 Subject: [PATCH] Remove tablespaces TODO.detail. --- doc/TODO.detail/tablespaces | 11732 ---------------------------------- 1 file changed, 11732 deletions(-) delete mode 100644 doc/TODO.detail/tablespaces diff --git a/doc/TODO.detail/tablespaces b/doc/TODO.detail/tablespaces deleted file mode 100644 index 4c1351aba7..0000000000 --- a/doc/TODO.detail/tablespaces +++ /dev/null @@ -1,11732 +0,0 @@ -From pgsql-hackers-owner+M174@hub.org Sun Mar 12 22:31:11 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA25886 - for ; Sun, 12 Mar 2000 23:31:10 -0500 (EST) -Received: from news.tht.net (news.hub.org [216.126.91.242]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA04589 for ; Sun, 12 Mar 2000 23:19:33 -0500 (EST) -Received: from hub.org (hub.org [216.126.84.1]) - by news.tht.net (8.9.3/8.9.3) with SMTP id XAA42854; - Sun, 12 Mar 2000 23:05:05 -0500 (EST) - (envelope-from pgsql-hackers-owner+M174@hub.org) -Received: from candle.pha.pa.us (root@s5-03.ppp.op.net [209.152.195.67]) - by hub.org (8.9.3/8.9.3) with ESMTP id XAA95917 - for ; Sun, 12 Mar 2000 23:00:56 -0500 (EST) - (envelope-from pgman@candle.pha.pa.us) -Received: (from pgman@localhost) - by candle.pha.pa.us (8.9.0/8.9.0) id WAA25403 - for pgsql-hackers@postgreSQL.org; Sun, 12 Mar 2000 22:59:56 -0500 (EST) -From: Bruce Momjian -Message-Id: <200003130359.WAA25403@candle.pha.pa.us> -Subject: [HACKERS] Fix for RENAME -To: PostgreSQL-development -Date: Sun, 12 Mar 2000 22:59:56 -0500 (EST) -X-Mailer: ELM [version 2.4ME+ PL72 (25)] -MIME-Version: 1.0 -Content-Type: text/plain; charset=US-ASCII -Content-Transfer-Encoding: 7bit -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -I have thought about the issue with ALTER TABLE RENAME and keeping the -file system in sync with the database. 
- -It seems there are three commands that can cause these to get out of -sync: - - CREATE TABLE/INDEX - DROP TABLE/INDEX - ALTER TABLE RENAME - -Now, if we had file names based only on the oid, we can eliminate file -renaming for RENAME, but the others are still a problem. - -Seems there are three ways to get out of sync: - - ABORT transaction - backend crash - OS crash - -The last two are the same, except the backend crash restarts the -postmaster, while the OS crash has the postmaster starting up normally. - -Here is my idea. Create a C List of file names to unlink on transaction -commit or abort. For CREATE, unlink created files on transaction ABORT. -For DROP, unlink dropped files on COMMIT. For RENAME, create a hard -link for the new table linked to old table, and unlink the old file name -on COMMIT or the new file on ABORT. - -That takes care of COMMIT and ABORT. For backend crash or OS crash, add -a postgres command-line flag for recovery. Have the postmaster on -startup or shared memory refresh start up a postgres backend on every -database with the recovery flag set. Have the postgres backend find all -the oids in the pg_class table, and have it go through every file in the -database directory and remove all files that don't match the oids/names -in pg_class. Also, remove all old sort, noname, and temp files at the -same time. Seems we should be doing this anyway. - -Care would have to be taken that a corrupted database that caused a -postgres crash on connection would not get the postmaster startup into -an infinite loop. - -Comments? - --- - Bruce Momjian | http://www.op.net/~candle - pgman@candle.pha.pa.us | (610) 853-3000 - + If your life is a hard drive, | 830 Blythe Avenue - + Christ can be your backup. 
| Drexel Hill, Pennsylvania 19026 - -From reedstrm@wallace.ece.rice.edu Tue Mar 14 12:33:31 2000 -Received: from wallace.ece.rice.edu (root@wallace.ece.rice.edu [128.42.12.154]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA23826 - for ; Tue, 14 Mar 2000 13:33:29 -0500 (EST) -Received: by wallace.ece.rice.edu - via sendmail from stdin - id (Debian Smail3.2.0.102) - for pgman@candle.pha.pa.us; Tue, 14 Mar 2000 12:33:32 -0600 (CST) -Date: Tue, 14 Mar 2000 12:33:32 -0600 -From: "Ross J. Reedstrom" -To: Hiroshi Inoue -Cc: Bruce Momjian , - PostgreSQL-development -Subject: Re: [HACKERS] Fix for RENAME -Message-ID: <20000314123331.A6094@rice.edu> -References: <200003140317.WAA27733@candle.pha.pa.us> <000c01bf8d75$a0016800$2801007e@tpf.co.jp> -Mime-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -User-Agent: Mutt/1.0i -In-Reply-To: <000c01bf8d75$a0016800$2801007e@tpf.co.jp>; from Inoue@tpf.co.jp on Tue, Mar 14, 2000 at 02:24:52PM +0900 -Status: RO - -Hiroshi - -I've just about finished working up a patch to store the physical -file name in the pg_class table. There are only two places that -require a Rule for generating the filename, and one of them is -only used for bootstrapping. For the initial cut, I used the rule: - -The filename consists of the TABLENAME, and underscore, and the OID. -If this is longer than NAMEDATALEN, shorten the TABLENAME. - -I implemented this rule by exporting Tom's makeObjectName function -from analyze.c, which is used to make other system generated names -that are have a requirement to be human readable. Replacing this -rule with any other in the future would be straightforward, except -for bootstrap. There are a number of places in bootstrap that need to -know the filename. I've factored them out into yet another set of -#defines (in catname.h) to make that easier. 
- - -I'm working through the regression tests right now: this is a relatively -extensive change, since it modifies the low level access routines, and the -buffer cache (which I indexed on physical filename, rather than relname, -as it is now) Hopefully, I caught all the places that assume relname == -filename == unique name within a single database (see, I want schemas...) - -Ross --- -Ross J. Reedstrom, Ph.D., -NSBRI Research Scientist/Programmer -Computer and Information Technology Institute -Rice University, 6100 S. Main St., Houston, TX 77005 - - - - - -On Tue, Mar 14, 2000 at 02:24:52PM +0900, Hiroshi Inoue wrote: -> > -----Original Message----- -> > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> > -> > > > They use the existing table file. It is only when -> > > > adding/removing/renaming file system files that this -> > out-of-sync problem -> > > > happens. -> > > > -> > -> > Not sure. I was going to get the CREATE/DROP/RENAME working as it -> > should then as we add more features, we can implement this solution for -> > them too. -> > -> -> Hmm,is general solution difficult ? -> Is more flexible naming rule bad ? -> -> This the 3rd or 4th time that I mention the following. -> -> PostgreSQL doesn't keep the information in itself where tables are -> allocated. So we need a naming rule to find where existent tables -> are allocated. Don't you wonder the spec ? -> -> Regards. 
-> -> Hiroshi Inoue -> Inoue@tpf.co.jp -> -> - -From mascarm@mascari.com Tue Mar 14 16:34:04 2000 -Received: from corvette.mascari.com (dhcp26136016.columbus.rr.com [24.26.136.16]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04395 - for ; Tue, 14 Mar 2000 17:32:14 -0500 (EST) -Received: from mascari.com (ferrari.mascari.com [192.168.2.1]) - by corvette.mascari.com (8.9.3/8.9.3) with ESMTP id RAA09562; - Tue, 14 Mar 2000 17:27:22 -0500 -Message-ID: <38CEBD0A.52ADB37E@mascari.com> -Date: Tue, 14 Mar 2000 17:28:26 -0500 -From: Mike Mascari -X-Mailer: Mozilla 4.7 [en] (Win95; I) -X-Accept-Language: en -MIME-Version: 1.0 -To: Bruce Momjian -CC: Hiroshi Inoue , - PostgreSQL-development -Subject: Re: [HACKERS] Fix for RENAME -References: <200003141545.KAA17518@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: RO - -Bruce Momjian wrote: -> -> > Hmm,is general solution difficult ? -> > Is more flexible naming rule bad ? -> > -> > This the 3rd or 4th time that I mention the following. -> -> That's because I didn't understand. -> -> > -> > PostgreSQL doesn't keep the information in itself where tables are -> > allocated. So we need a naming rule to find where existent tables -> > are allocated. Don't you wonder the spec ? -> -> How does naming the files in the database help our DROP/CREATE problem? -> It would help RENAME a little bit. Not sure about the others because -> currently they don't have a problem. - -I've been thinking about this somewhat, and I think the first -step necessary in correctly supporting ROLLBACK-able DDL -statements in transactions is the change to _. 
-Imagine the scenario: - -CREATE TABLE test (key int4); - -a) Session #1: - -BEGIN; - -b) Session #2: - -BEGIN; -DROP TABLE test; -CREATE TABLE test (value varchar(32)); - -c) Session #1: - -DROP TABLE test; -COMMIT; - -d) Session #2: - -COMMIT; - -What's clear to me is that, if DDL statements are to be -ROLLBACK-able, either (1) an AccessExclusive lock is held on the -relation until transaction commit (like Phillip Warner stated was -Dec/Rdb's behavior) or (2) PostgreSQL must be capable of -supporting "multi-versioned schema" as well as tuples. Before -step 'c' is executed, both tables must simultaneously exist in -the database with the same name, which works fine in the cataloge -thanks to MVCC, but requires that, on disk, there exists: - -test_01231 - Session #1's table, available for ROLLBACK -test_13421 - Session #2's table, available for COMMIT - -Now, I believe it was Andreas who suggested that VACUUM be -modified to perform cleanup. I agree with this. VACUUM will need -to check for aborted relation tuples in pg_class and remove the -associated file from the filesystem in the event, for example, -that Session #2 aborted -or- Session #1 aborted leaving the -original pg_class tuple the "active" one and Session #2 attempted -to COMMIT, which violates the UNIQUE constraint on the relname of -pg_class. In addition, for "active" relation entries, VACUUM -should verify the filename is -_ for the given oid. If it is not, it should rename -the filename on the filesystem. Again, this is purely cosmetic -for administrative purposes only, but would allow -for lack of atomicity only with respect to the label of the -relation file, until the next -VACUUM is run. - -For the case of ALTER TABLE RENAME, ALTER TABLE DROP COLUMN, -etc., the same functionality would apply. 
But, as in previous -discussions regarding ALTER TABLE DROP COLUMN, PostgreSQL MUST be -capable of allowing multiple tuples with different attribute -counts and types within the same relation: - -CREATE TABLE test (key int4); - -a) Session #1: - -BEGIN; - -b) Session #2: - -BEGIN; -ALTER TABLE test ADD COLUMN value int4; -INSERT INTO test values (1, 1); - -c) Session #1: - -INSERT INTO test values (0); -COMMIT; - -d) Session #2: - -COMMIT; - -This also means that Hiroshi's plan to suppress the visibility of -attributes for ALTER TABLE DROP COLUMN would be required anyway, -to allow for "multi-versioning" of attributes within a single -tuple (i.e., like multi-versioning of tuples within relations), -an attribute is either visible or not, but the tuple should -always grow, until, of course, the next VACUUM. - -So, to support rollback-able DDL statements ("multi-versioning -schema", if you will), PostgreSQL needs: - -1) relation names of the form _ -2) support "multi-versioning" of attributes within a single tuple -3) modify VACUUM to: - - A) Remove filesystem files whose pg_class tuples are no longer -valid - B) Rename filesystem files to relname of pg_class when the -_ doesn't match - C) Reconstruct relations after attributes have been -added/dropped. - -4) All DDL statements should perform their non-create filesystem -functions in the now infamous "post-transaction-commit" trigger. -If the backend should crash between the time the transaction -committed and the rename() or unlink(), no adverse affects would -be encountered with the database WRT data, VACUUM would clean up -the rename() problem, and, worst-case scenario, an old -_ file would lie around unused. But at least it -would no longer prohibit the creation of a table by the same -name.... 
- -Just my humble opinion, - -Mike Mascari - -From Inoue@tpf.co.jp Tue Mar 14 20:31:35 2000 -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA08792 - for ; Tue, 14 Mar 2000 21:30:35 -0500 (EST) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id LAA00515; Wed, 15 Mar 2000 11:29:09 +0900 -From: "Hiroshi Inoue" -To: "Ross J. Reedstrom" , - "Bruce Momjian" -Cc: "PostgreSQL-development" -Subject: RE: [HACKERS] Fix for RENAME -Date: Wed, 15 Mar 2000 11:35:46 +0900 -Message-ID: <000c01bf8e27$2b3c3ce0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -In-Reply-To: <20000314123331.A6094@rice.edu> -Importance: Normal -Status: ROr - -> -----Original Message----- -> From: Ross J. Reedstrom [mailto:reedstrm@wallace.ece.rice.edu] -> -> Hiroshi - -> I've just about finished working up a patch to store the physical -> file name in the pg_class table. There are only two places that -> require a Rule for generating the filename, and one of them is -> only used for bootstrapping. - -Thanks for your trial. -It's nice that only two places require naming rule. - -I don't stick to one naming rule. -The only limitation is the uniqueness and the rule -could be changed according to situations. -For example,we could change the naming rule according to -the kind of relation such as system/user relations. - -I'm now inclined to introduce a new system relation to store -the physical path name. It could also have table(data)space -information in the (near ?) future. -It seems better to separate it from pg_class because table(data?) -space may change the concept of table allocation. - -Comments ? - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - - -From Inoue@tpf.co.jp Wed Mar 15 02:00:58 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA17887 - for ; Wed, 15 Mar 2000 03:00:57 -0500 (EST) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id CAA02974 for ; Wed, 15 Mar 2000 02:54:44 -0500 (EST) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id QAA00734; Wed, 15 Mar 2000 16:53:56 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" -Cc: "Ross J. Reedstrom" , - "PostgreSQL-development" -Subject: RE: [HACKERS] Fix for RENAME -Date: Wed, 15 Mar 2000 17:00:35 +0900 -Message-ID: <001101bf8e54$8b941cc0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -In-Reply-To: <200003150433.XAA13256@candle.pha.pa.us> -Importance: Normal -Status: ROr - -> -----Original Message----- -> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> -> > I'm now inclined to introduce a new system relation to store -> > the physical path name. It could also have table(data)space -> > information in the (near ?) future. -> > It seems better to separate it from pg_class because table(data?) -> > space may change the concept of table allocation. -> -> Why not just put it in pg_class? -> - -Not sure,it's only my feeling. -Comments please,everyone. - -We have taken a practical way which doesn't break file per table -assumption in this thread and it wouldn't so difficult to implement. -In fact Ross has already tried it. - -However there was a discussion about data(table)space for -months ago and currently a new discussion is there. 
-Judging from the previous discussion,I can't expect so much -that it could get a practical consensus(How many opinions there -were). We can make a practical step toward future by encapsulating -the information of table allocation. Separating table alloc info from -pg_class seems one of the way. -There may be more essential things for encapsulation. - -Comments ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - -From pgsql-hackers-owner+M196@hub.org Thu Mar 16 03:02:35 2000 -Received: from hub.org (hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA05789 - for ; Thu, 16 Mar 2000 04:02:29 -0500 (EST) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with SMTP id CAA27302; - Thu, 16 Mar 2000 02:58:55 -0500 (EST) - (envelope-from pgsql-hackers-owner+M196@hub.org) -Received: from downtown.oche.de (root@downtown.oche.de [194.94.253.3]) - by hub.org (8.9.3/8.9.3) with ESMTP id CAA23907 - for ; Thu, 16 Mar 2000 02:37:54 -0500 (EST) - (envelope-from mne@darwin.oche.de) -Received: from darwin.oche.de (uucp@localhost) - by downtown.oche.de (8.9.3/8.9.3/Debian/GNU) with SMTP id IAA30654 - for ; Thu, 16 Mar 2000 08:40:04 +0100 -Received: from mne by darwin.oche.de with local (Exim 3.12 #1 (Debian)) - id 12VUhX-0003Vz-00 - for ; Thu, 16 Mar 2000 08:28:11 +0100 -Date: Thu, 16 Mar 2000 08:28:11 +0100 (CET) -From: Martin Neumann -Subject: [HACKERS] RfD: Design of tablespaces -To: pgsql-hackers@postgresql.org -MIME-Version: 1.0 -Content-Type: TEXT/plain; CHARSET=US-ASCII -Message-Id: -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - - -I have written some thoughts on the concept of tablespace -down. I would be happy to get some comments on it. - ------------------------------------------------------------------ - Implementation of tablespaces within PostgreSQL -- a brainstorming paper designed for general discussion - - -by Martin Neumann, 2000/3/15 - - -1. What are tablespaces? 
-------------------------- - -Tablespaces make it possible to distribute storage objects -over multiple points of storage (POS). Therefore one could -say a tablespace can be a POS. - -Example: - -tablespace_a -----> /mnt/raid/arena0/ -tablespace_b -----> /mnt/raid/emc0/ - -Tablespaces can also store their data on other tablespaces: - -tablespace_c -----> tablespace_b - -This is quite interesting for administration purposes. - - -2. What are its advantages? ----------------------------- - -As you can choose a different tablespace for every storage -object (table, index etc.) it is easy to improve the following -aspects of your system: - - - Reliability - - You can put storage objects (mostly tables) you strongly depend - on onto a more reliable tablespace (mirrored RAID or perhaps - simply a directory which gets backed up more often than others). - - - Speed - - You can put storage objects you rarely need onto a rather slow - tablespace and keep your quick tablespaces clean from this. - - A fast, but more expensive RAID-Stripeset can be used more - efficiently as it doesn't get filled with non-performance - sensitive data. - - But also distributing storage objects which have equal needs - in sense of speed onto different tablespaces makes sense as - you gain more speed by distributing data over more than one - hard disk spindle. - - - Manageability - - You can grant and revoke rights on the basis of a tablespace. - - As every storage object belongs to exactly one tablespace, - you can easily group storage objects using a tablespace. - - -3. What about disk I/O? ------------------------- - -Tablespaces tell the storage manager only where to store -the data, not how. This is the reasonable way. - - -4. 
Usage ---------- - -CREATE TABLESPACE tsname TYPE storage_type storage_options - -Examples: - -CREATE TABLESPACE tsemc0 - TYPE classic DIRECTORY /mnt/raid/emc0 NOFSYNC - -CREATE TABLESPACE tsarena0 TYPE raw DEVICE /dev/araid/0 - MINSIZE 128 MAXSIZE 4096 GROW 4 32 SHRINK 2 32 - BLOCKSIZE 16384 - -CREATE TABLESPACE quick0 TYPE link TABLESPACE tsarena0; - --- - -CREATE TABLE tbname ( ... ) TABLESPACE tsname; - -Examples: - -CREATE TABLE foo ( - id int4 NOT NULL UNIQUE, - name text NOT NULL -) TABLESPACE tsemc0; - -CREATE TABLE bar ( - id int4 NOT NULL UNIQUE, - name text NOT NULL -) TABLESPACE default; - -If the tablespace isn't given, the storage objects gets created -in the "default" tablespace. - -"default" is the PostgreSQL's default tablespace and the only one -which has to exist on each system. - --- - -ALTER TABLESPACE tsname tssettings - -Examples: - -ALTER TABLESPACE tsemc0 DIRECTORY /mnt/raid/emc1 - - -NOTE: altering tablespaces without recreating the contained -storage objects introduces many problems. -Realisation is difficult and won't be my first goal. - --- - -DROP TABLESPACE tsname [FORCE] - -Examples: - -DROP TABLESPACE tsarena0 - -This will immediately remove the tablespace tsarena0 -if it contains no storage objects. - -If it still contains some the tablespace is marked for -deletion. - -This means: -1. you can't create new storage objects in the tablespace -2. if the last storage object inside gets dropped, the - tablespace will be removed. - - -DROP TABLESPACE tsarena0 FORCE - -This will remove the tablespace including all contained -storage objects immediately. - --- - -VACUUM tsname - -Example: - -VACUUM tsemc1 - -This will vacuum a single tablespace with all contained -storage objects. ------------------------------------------------------------------ - --- -Martin Neumann, Welkenrather Str. 118c, 52074 Aachen, Germany -mne@mne.de - http://www.mne.de/mne/ - sms@mne.de [eMail2SMS] -Tel. 
0241 / 8876-080 - Mobil: 0173 / 27 69 632 -..------.--------------------------------------------------------- -| at | Inform GmbH - Abteilung Airport Logistics -| work | Pascalstr. 23 - 52076 Aachen - Tel. 02408 / 9456-0 -|______| martin.neumann@inform-ac.com - http://www.inform-ac.com - - - -From JanWieck@t-online.de Wed Jun 14 19:01:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA21372 - for ; Wed, 14 Jun 2000 19:00:59 -0400 (EDT) -Received: from mailout02.sul.t-online.com (mailout02.sul.t-online.com [194.25.134.17]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id SAA01930 for ; Wed, 14 Jun 2000 18:51:11 -0400 (EDT) -Received: from fwd01.sul.t-online.de - by mailout02.sul.t-online.com with smtp - id 132Lz6-0004ec-01; Thu, 15 Jun 2000 00:50:08 +0200 -Received: from hot.jw.home (340000654369-0001@[62.224.107.172]) by fwd01.sul.t-online.de - with esmtp id 132Lyy-0tYyi9C; Thu, 15 Jun 2000 00:50:00 +0200 -Received: (from wieck@localhost) - by hot.jw.home (8.8.5/8.8.5) id WAA07887; - Wed, 14 Jun 2000 22:43:39 +0200 -From: JanWieck@t-online.de (Jan Wieck) -Message-Id: <200006142043.WAA07887@hot.jw.home> -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <14752.960996980@sss.pgh.pa.us> from Tom Lane at "Jun 14, 2000 11:36:20 - am" -To: Tom Lane -Date: Wed, 14 Jun 2000 22:43:39 +0200 (MEST) -CC: Oliver Elphick , Bruce Momjian , - PostgreSQL-development -Reply-To: Jan Wieck -X-Mailer: ELM [version 2.4ME+ PL68 (25)] -MIME-Version: 1.0 -Content-Type: text/plain; charset=US-ASCII -Content-Transfer-Encoding: 7bit -X-Sender: 340000654369-0001@t-dialin.net -Status: ROr - -Tom Lane wrote: -> "Oliver Elphick" writes: -> > I suggest that DROP TABLE in a transaction should not be allowed. -> -> I had actually made it do that for a short time early this year, -> and was shouted down. 
On reflection I have to agree; it's too useful -> to be able to do -> -> begin; -> drop table foo; -> create table foo(new schema); -> ... -> end; -> -> You do indeed lose big if you suffer an error partway through, but -> the answer to that is to fix our file naming conventions so that we -> can support rollback of drop table. - - Belongs IMHO to the discussion to keep separate what is - separate (having indices/toast-relations/etc. in separate - directories and whatnot). - - I've never been really happy with the file naming - conventions. The need of a filesystem entry to have the same - name of the DB object that is associated with it isn't right. - I know, some people love to be able to easily identify the - files with ls(1). OTOH what is that good for? - - Well, someone can easily see how big the disk footprint of - his data is. Whow - what an info. Anything else? - - Why not changing the naming to be something like this: - - /catalog_tables/pg_... - /catalog_index/pg_... - /user_tables/oid_... - /user_index/oid_... - /temp_tables/oid_... - /temp_index/oid_... - /toast_tables/oid_... - /toast_index/oid_... - /whatnot_???/... - - This way, it would be much easier to separate all the - different object types to different physical media. We would - loose some transparency, but I've allways wondered what - people USE that for (except for just wanna know). For - convinience we could implement another little utility that - tells the object size like - - DESCRIBE TABLE/VIEW/whatnot - - that returns the physical location and storage details of the - object. And psql could use it to print this info additional - on the \d commands. Would give unprivileged users access to - this info, so be it, it's not a security issue IMHO. - - The subdirectory an object goes into has to be controlled by - the relkind. So we need to tidy up that a little too. I think - it's worth it. - - The objects storage location (the bare file) now would - contain the OID. 
So we avoid naming conflicts for temp - tables, naming conflicts during DROP/CREATE in a transaction - and all the like. - - Comments? - - -Jan - --- - -#======================================================================# -# It's easier to get forgiveness for being wrong than for being right. # -# Let's break this rule - forgive me. # -#================================================== JanWieck@Yahoo.com # - - - -From tgl@sss.pgh.pa.us Wed Jun 14 22:06:54 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA02821 - for ; Wed, 14 Jun 2000 22:06:52 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16609; - Wed, 14 Jun 2000 22:07:16 -0400 (EDT) -To: Jan Wieck -cc: Oliver Elphick , Bruce Momjian , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006142043.WAA07887@hot.jw.home> -References: <200006142043.WAA07887@hot.jw.home> -Comments: In-reply-to JanWieck@t-online.de (Jan Wieck) - message dated "Wed, 14 Jun 2000 22:43:39 +0200" -Date: Wed, 14 Jun 2000 22:07:15 -0400 -Message-ID: <16606.961034835@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -JanWieck@t-online.de (Jan Wieck) writes: -> I've never been really happy with the file naming -> conventions. The need of a filesystem entry to have the same -> name of the DB object that is associated with it isn't right. -> I know, some people love to be able to easily identify the -> files with ls(1). OTOH what is that good for? - -I agree with Jan on this: let's just change the file names over to -be OIDs. Then we can have rollbackable DROP and RENAME TABLE easily. -Naming the files after the logical names of the tables is nice if it -doesn't cost anything, but it is *not* worth the trouble to preserve -a relationship between filename and tablename when it is costing us. -And it's costing us big time. 
That single feature is hurting us on -functionality, robustness, and portability, and for what benefit? -Not nearly enough. It's time to just let go of it. - -> Why not changing the naming to be something like this: - -> /catalog_tables/pg_... -> /catalog_index/pg_... -> /user_tables/oid_... -> /user_index/oid_... -> /temp_tables/oid_... -> /temp_index/oid_... -> /toast_tables/oid_... -> /toast_index/oid_... -> /whatnot_???/... - -I don't see a lot of value in that. Better to do something like -tablespaces: - - // - - regards, tom lane - -From tgl@sss.pgh.pa.us Wed Jun 14 22:20:59 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA25561 - for ; Wed, 14 Jun 2000 22:20:56 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16708; - Wed, 14 Jun 2000 22:21:30 -0400 (EDT) -To: Bruce Momjian -cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006142313.TAA22904@candle.pha.pa.us> -References: <200006142313.TAA22904@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 14 Jun 2000 19:13:47 -0400" -Date: Wed, 14 Jun 2000 22:21:30 -0400 -Message-ID: <16705.961035690@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> You need something that works from the command line, and something that -> works if PostgreSQL is not running. How would you restore one file from -> a tape. - -"Restore one file from a tape"? How are you going to do that anyway? -You can't save and restore portions of a database like that, because -of transaction commit status problems. To restore table X correctly, -you'd have to restore pg_log as well, and then your other tables are -hosed --- unless you also restore all of them from the backup. 
Only -a complete database restore from tape would work, and for that you -don't need to tell which file is which. So the above argument is a -red herring. - -I realize it's nice to be able to tell which table file is which by -eyeball, but the price we are paying for that small convenience is -just too high. Give that up, and we can have rollbackable DROP and -RENAME now (I'll personally commit to making it happen for 7.1). -Continue to insist on it, and I don't think we'll *ever* have those -features in a really robust form. It's just not possible to do -multiple file renames atomically. - - regards, tom lane - -From pgsql-hackers-owner+M3382@hub.org Wed Jun 14 22:31:42 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA10091 - for ; Wed, 14 Jun 2000 22:31:41 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5F2UI853244; - Wed, 14 Jun 2000 22:30:18 -0400 (EDT) -Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5F2Th852641 - for ; Wed, 14 Jun 2000 22:29:43 -0400 (EDT) -Received: (from pgman@localhost) - by candle.pha.pa.us (8.9.0/8.9.0) id WAA06576; - Wed, 14 Jun 2000 22:28:53 -0400 (EDT) -From: Bruce Momjian -Message-Id: <200006150228.WAA06576@candle.pha.pa.us> -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <16705.961035690@sss.pgh.pa.us> "from Tom Lane at Jun 14, 2000 10:21:30 - pm" -To: Tom Lane -Date: Wed, 14 Jun 2000 22:28:53 -0400 (EDT) -CC: Jan Wieck , Oliver Elphick
, - PostgreSQL-development -X-Mailer: ELM [version 2.4ME+ PL77 (25)] -MIME-Version: 1.0 -Content-Transfer-Encoding: 7bit -Content-Type: text/plain; charset=US-ASCII -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> Bruce Momjian writes: -> > You need something that works from the command line, and something that -> > works if PostgreSQL is not running. How would you restore one file from -> > a tape. -> -> "Restore one file from a tape"? How are you going to do that anyway? -> You can't save and restore portions of a database like that, because -> of transaction commit status problems. To restore table X correctly, -> you'd have to restore pg_log as well, and then your other tables are -> hosed --- unless you also restore all of them from the backup. Only -> a complete database restore from tape would work, and for that you -> don't need to tell which file is which. So the above argument is a -> red herring. -> -> I realize it's nice to be able to tell which table file is which by -> eyeball, but the price we are paying for that small convenience is -> just too high. Give that up, and we can have rollbackable DROP and -> RENAME now (I'll personally commit to making it happen for 7.1). -> Continue to insist on it, and I don't think we'll *ever* have those -> features in a really robust form. It's just not possible to do -> multiple file renames atomically. -> - -OK, I am flexible. (Yea, right.) :-) - -But seriously, let me give some background. I used Ingres, that used -the VMS file system, but used strange sequential AAAF324 numbers for -tables. When someone deleted a table, or we were looking at what tables -were using disk space, it was impossible to find the Ingres table names -that went with the file. There was a system table that showed it, but -it was poorly documented, and if you deleted the table, there was no way -to look on the tape to find out which file to restore. 
- -As far as pg_log, you certainly would not expect to get any information -back from the time of the backup table to current, so the current pg_log -would be just fine. - -Basically, I guess we have to do it, but we have to print the proper -error messages for cases in the backend we just print the file name. -Also, we have to now replace the 'ls -l' command with something that -will be meaningful. - -Right now, we use 'ps' with args to display backend information, and ls --l to show disk information. We are going to lose that here. - - - --- - Bruce Momjian | http://www.op.net/~candle - pgman@candle.pha.pa.us | (610) 853-3000 - + If your life is a hard drive, | 830 Blythe Avenue - + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 - -From tgl@sss.pgh.pa.us Wed Jun 14 22:31:01 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA09340 - for ; Wed, 14 Jun 2000 22:31:00 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16783 - for ; Wed, 14 Jun 2000 22:31:34 -0400 (EDT) -To: Bruce Momjian -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006150223.WAA06516@candle.pha.pa.us> -References: <200006150223.WAA06516@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 14 Jun 2000 22:23:58 -0400" -Date: Wed, 14 Jun 2000 22:31:33 -0400 -Message-ID: <16780.961036293@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -> Can I phone you? - -Sure, I'm here. 
- - regards, tom lane - -From pgsql-hackers-owner+M3383@hub.org Wed Jun 14 22:38:29 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA27501 - for ; Wed, 14 Jun 2000 22:38:28 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5F2bD870244; - Wed, 14 Jun 2000 22:37:13 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5F2af869743 - for ; Wed, 14 Jun 2000 22:36:41 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16814; - Wed, 14 Jun 2000 22:36:19 -0400 (EDT) -To: Bruce Momjian -cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006150228.WAA06576@candle.pha.pa.us> -References: <200006150228.WAA06576@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 14 Jun 2000 22:28:53 -0400" -Date: Wed, 14 Jun 2000 22:36:19 -0400 -Message-ID: <16810.961036579@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -Bruce Momjian writes: -> But seriously, let me give some background. I used Ingres, that used -> the VMS file system, but used strange sequential AAAF324 numbers for -> tables. When someone deleted a table, or we were looking at what tables -> were using disk space, it was impossible to find the Ingres table names -> that went with the file. There was a system table that showed it, but -> it was poorly documented, and if you deleted the table, there was no way -> to look on the tape to find out which file to restore. - -Fair enough, but it seems to me that the answer is to expend some effort -on system admin support tools. 
We could do a lot in that line with less -effort than trying to make a fundamentally mismatched filesystem -representation do what we need. - - regards, tom lane - -From tgl@sss.pgh.pa.us Wed Jun 14 23:13:35 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06306 - for ; Wed, 14 Jun 2000 23:13:26 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA16988; - Wed, 14 Jun 2000 23:13:53 -0400 (EDT) -To: Bruce Momjian -cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006150244.WAA27741@candle.pha.pa.us> -References: <200006150244.WAA27741@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 14 Jun 2000 22:44:16 -0400" -Date: Wed, 14 Jun 2000 23:13:52 -0400 -Message-ID: <16985.961038832@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> That was my point --- that in doing this change, we are taking on more -> TODO items, that may detract from our main TODO items. - -True, but they are also TODO items that could be handled by people other -than the inner circle of key developers. The actual rejiggering of -table-to-filename mapping is going to have to be done by one of the -small number of people who are fully up to speed on backend internals. -But we've got a lot more folks who would be able (and, hopefully, -willing) to design and code whatever tools are needed to make the -dbadmin's job easier in the face of the new filesystem layout. I'd -rather not expend a lot of core time to avoid needing those tools, -especially when I feel the old approach is fatally flawed anyway. - -> Even gdb shows us the filename/tablename in backtraces. We are never -> going to be able to reproduce that. - -Backtraces from *what*, exactly? 99% of the backend is still going -to be dealing with the same data as ever. 
It might be that poking -around in fd.c will be a little harder, but considering that fd.c -doesn't really know or care what the files it's manipulating are -anyway, I'm not convinced that this is a real issue. - -> I guess I don't consider table schema commands inside transactions and -> such to be as big an items as the utility features we will need to -> build. - -You've *got* to be kidding. We're constantly seeing complaints about -the fact that rolling back DROP or RENAME TABLE fails --- and worse, -leaves the table in a corrupted/inconsistent state. As far as I can -tell, that's one of the worst robustness problems we've got left to -fix. This is a big deal IMHO, and I want it to be fixed and fixed -right. I don't see how to fix it right if we try to keep physical -filenames tied to logical tablenames. - -Moreover, that restriction will continue to hurt us if we try to -preserve it while implementing tablespaces, ANSI schemas, etc. - - regards, tom lane - -From pgsql-hackers-owner+M3397@hub.org Thu Jun 15 03:03:33 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24286 - for ; Thu, 15 Jun 2000 03:03:32 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5F72T815284; - Thu, 15 Jun 2000 03:02:29 -0400 (EDT) -Received: from mailo.vtcif.telstra.com.au (mailo.vtcif.telstra.com.au [202.12.144.17]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5F721814963 - for ; Thu, 15 Jun 2000 03:02:01 -0400 (EDT) -Received: (from uucp@localhost) by mailo.vtcif.telstra.com.au (8.8.2/8.6.9) id RAA01186; Thu, 15 Jun 2000 17:01:48 +1000 (EST) -Received: from maili.vtcif.telstra.com.au(202.12.142.17) - via SMTP by mailo.vtcif.telstra.com.au, id smtpd0SbI.z; Thu Jun 15 17:00:39 2000 -Received: (from uucp@localhost) by maili.vtcif.telstra.com.au (8.8.2/8.6.9) id RAA21419; Thu, 15 Jun 2000 17:00:37 +1000 (EST) -Received: from localhost(127.0.0.1), claiming to be "mail.cdn.telstra.com.au" - via SMTP by localhost, id smtpdWTHrU_; Thu Jun 15 16:59:34 2000 -Received: from lunitari.nimrod.itg.telecom.com.au (lunitari.nimrod.itg.telecom.com.au [192.53.254.48]) by mail.cdn.telstra.com.au (8.8.2/8.6.9) with ESMTP id QAA04796; Thu, 15 Jun 2000 16:59:33 +1000 (EST) -Received: from nimrod.itg.telecom.com.au (majere [192.53.254.45]) - by lunitari.nimrod.itg.telecom.com.au (8.9.1/8.9.3) with ESMTP id QAA18056; - Thu, 15 Jun 2000 16:58:17 +1000 (EST) -Message-ID: <39487E0C.970680AB@nimrod.itg.telecom.com.au> -Date: Thu, 15 Jun 2000 16:56:12 +1000 -From: Chris Bitmead -Organization: IBM Global Services -X-Mailer: Mozilla 4.6 [en] (X11; I; SunOS 5.6 sun4u) -X-Accept-Language: en
-MIME-Version: 1.0 -To: "Ross J. Reedstrom" -CC: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Ross J. Reedstrom" wrote: - -> Any strong objections to the mixed relname_oid solution? It gets us -> everything oids does, and still lets Bruce use 'ls -l' to find the big -> tables, putting off writing any admin tools that'll need to be rewritten, -> anyway. - -Doesn't relname_oid defeat the purpose of oid file names, which is that -they don't change when the table is renamed? Wasn't it going to be oids -with a tool to create a symlink of relname -> oid ? - -From pgsql-hackers-owner+M3400@hub.org Thu Jun 15 03:31:16 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24604 - for ; Thu, 15 Jun 2000 03:31:15 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA01191 for ; Thu, 15 Jun 2000 03:15:28 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5F7CP835301; - Thu, 15 Jun 2000 03:12:25 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5F7Bt833744 - for ; Thu, 15 Jun 2000 03:11:55 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA18801; - Thu, 15 Jun 2000 03:11:53 -0400 (EDT) -To: "Ross J. 
Reedstrom" -cc: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <20000615010312.A995@rice.edu> -References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> -Comments: In-reply-to "Ross J. Reedstrom" - message dated "Thu, 15 Jun 2000 01:03:12 -0500" -Date: Thu, 15 Jun 2000 03:11:52 -0400 -Message-ID: <18798.961053112@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Ross J. Reedstrom" writes: -> Any strong objections to the mixed relname_oid solution? - -Yes! - -You cannot make it work reliably unless the relname part is the original -relname and does not track ALTER TABLE RENAME. IMHO having an obsolete -relname in the filename is worse than not having the relname at all; -it's a recipe for confusion, it means you still need admin tools to tell -which end is really up, and what's worst is you might think you don't. - -Furthermore it requires an additional column in pg_class to keep track -of the original relname, which is a waste of space and effort. - -It also creates a portability risk, or at least fails to remove one, -since you are critically dependent on the assumption that the OS -supports long filenames --- on a filesystem that truncates names to less -than about 45 characters you're in very deep trouble. An OID-only -approach still works on traditional 14-char-filename Unix filesystems -(it'd mostly even work on DOS 8+3, though I doubt we care about that). - -Finally, one of the reasons I want to go to filenames based only on OID -is that that'll make life easier for mdblindwrt. Original relname + OID -doesn't help, in fact it makes life harder (more shmem space needed to -keep track of the filename for each buffer). - -Can we *PLEASE JUST LET GO* of this bad idea? No relname in the -filename. Period. 
- - regards, tom lane - -From tgl@sss.pgh.pa.us Thu Jun 15 03:31:11 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24592 - for ; Thu, 15 Jun 2000 03:31:10 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA01213 for ; Thu, 15 Jun 2000 03:15:46 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA18833; - Thu, 15 Jun 2000 03:14:30 -0400 (EDT) -To: Bruce Momjian -cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006150321.XAA09510@candle.pha.pa.us> -References: <200006150321.XAA09510@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 14 Jun 2000 23:21:15 -0400" -Date: Thu, 15 Jun 2000 03:14:30 -0400 -Message-ID: <18830.961053270@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Bruce Momjian writes: -> Well, we did have someone do a test implementation of oid file names, -> and their report was that is looked pretty ugly. However, if people are -> convinced it has to be done, we can get started. I guess I was waiting -> for Vadim's storage manager, where the whole idea of separate files is -> going to go away anyway, I suspect. We would then have to re-write all -> our admin tools for the new format. - -I seem to recall him saying that he wanted to go to filename == OID -just like I'm suggesting. But I agree we probably ought to hold off -doing anything until he gets back from Russia and can let us know -whether that's still his plan. If he is planning one-huge-file or -something like that, we might as well let these issues go unfixed -for one more release cycle. 
- - regards, tom lane - -From ZeugswetterA@wien.spardat.at Thu Jun 15 03:30:59 2000 -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24584 - for ; Thu, 15 Jun 2000 03:30:56 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id JAA29140; - Thu, 15 Jun 2000 09:31:12 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Thu, 15 Jun 2000 09:31:12 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE4@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" , Bruce Momjian -Cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -Subject: AW: [HACKERS] Big 7.1 open items -Date: Thu, 15 Jun 2000 09:31:11 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - - -> Bruce Momjian writes: -> > You need something that works from the command line, and -> something that -> > works if PostgreSQL is not running. How would you restore -> one file from -> > a tape. -> -> "Restore one file from a tape"? How are you going to do that anyway? -> You can't save and restore portions of a database like that, because -> of transaction commit status problems. To restore table X correctly, -> you'd have to restore pg_log as well, and then your other tables are -> hosed --- unless you also restore all of them from the backup. Only -> a complete database restore from tape would work, and for that you -> don't need to tell which file is which. So the above argument is a -> red herring.
- ->From what I know it is possible to simply restore one table file -since pg_log keeps all tid's. Of course it cannot guarantee integrity -and does not work if the table was altered. - -> I realize it's nice to be able to tell which table file is which by -> eyeball, but the price we are paying for that small convenience is -> just too high. Give that up, and we can have rollbackable DROP and -> RENAME now (I'll personally commit to making it happen for 7.1). -> Continue to insist on it, and I don't think we'll *ever* have those -> features in a really robust form. It's just not possible to do -> multiple file renames atomically. - -In the last proposal Bruce and I had it all layed out for tabname + oid -with no overhead in the normal situation, and little overhead if a rename -table crashed or was not rolled back or committed properly -which imho had all advantages combined. - -Andreas - -From ZeugswetterA@wien.spardat.at Thu Jun 15 04:31:04 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA25144 - for ; Thu, 15 Jun 2000 04:31:03 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA03225 for ; Thu, 15 Jun 2000 04:05:41 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA100894; - Thu, 15 Jun 2000 10:04:52 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Thu, 15 Jun 2000 10:04:52 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE7@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Don Baccus'" , - Bruce Momjian - , Tom Lane -Cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -Subject: AW: [HACKERS] Big 7.1 open items -Date: Thu, 15 Jun 2000 10:04:51 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service 
(5.5.2448.0) -Content-Type: text/plain; - charset="windows-1252" -Status: RO - - -> In reality, very few people are going to be interested in restoring -> a table in a way that breaks referential integrity and other -> normal assumptions about what exists in the database. - -This is not true. In my DBA history it would have saved me manweeks -of work if an easy and efficient restore of one single table from backup -would have been available in Informix and Oracle. -We allways had to restore most of the whole system to another machine only -to get back at some table info that would then be manually re-added -to the production system. -A restore of one table to a different/new tablename would have been -very convenient, and this is currently possible in PostgreSQL. -(create new table with same schema, then replace new table data file -with file from backup) - -> The reality -> is that most people are going to engage in a little time travel -> to a past, consistent backup rather than do as you suggest. - -No, this is what is done most of the time, but it is very inconvenient -to tell people that they loose all work from past days, so it is usually -done as I noted above if possible. We once had a situation where all data -was deleted from a table, but the problem was only noticed 3 weeks later. - -> This is going to be more and more true as Postgres gains more and -> more acceptance in (no offense intended) the real world. -> -> >Right now, we use 'ps' with args to display backend -> information, and ls -> >-l to show disk information. We are going to lose that here. -> -> Dependence on "ls -l" is, IMO, a very weak argument. - -In normal situations where everything works I agree, it is the -error situations where it really helps if you see what data is where. -debugging, lsof, Bruce already named them. 
- -Andreas - -From pgsql-hackers-owner+M3405@hub.org Thu Jun 15 04:31:09 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA25151 - for ; Thu, 15 Jun 2000 04:31:07 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA04151 for ; Thu, 15 Jun 2000 04:30:23 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5F8RI883087; - Thu, 15 Jun 2000 04:27:18 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5F8Qx881928 - for ; Thu, 15 Jun 2000 04:27:00 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA79848; - Thu, 15 Jun 2000 10:26:13 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Thu, 15 Jun 2000 10:26:14 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE8@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" , - "Ross J. Reedstrom" - -Cc: PostgreSQL-development -Subject: AW: [HACKERS] Big 7.1 open items -Date: Thu, 15 Jun 2000 10:26:12 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - - -> "Ross J. Reedstrom" writes: -> > Any strong objections to the mixed relname_oid solution? -> -> Yes! -> -> You cannot make it work reliably unless the relname part is -> the original -> relname and does not track ALTER TABLE RENAME. - -It does, or should at least. Only problem case is where db crashes during -alter or commit/rollback. 
This could be fixed by first open that fails to -find the file -or vacuum, or some other utility. - -> IMHO having -> an obsolete -> relname in the filename is worse than not having the relname at all; -> it's a recipe for confusion, it means you still need admin -> tools to tell -> which end is really up, and what's worst is you might think you don't. -> -> Furthermore it requires an additional column in pg_class to keep track -> of the original relname, which is a waste of space and effort. - -it does not. - -> Finally, one of the reasons I want to go to filenames based -> only on OID -> is that that'll make life easier for mdblindwrt. Original -> relname + OID -> doesn't help, in fact it makes life harder (more shmem space needed to -> keep track of the filename for each buffer). - -I do not see this. filename is constructed from relname+oid. -if not found, do directory scan for *_.dat, if found --> rename. - -Andreas - -From pgsql-hackers-owner+M3407@hub.org Thu Jun 15 05:01:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA25462 - for ; Thu, 15 Jun 2000 05:01:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA04667 for ; Thu, 15 Jun 2000 04:45:51 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5F8gr817124; - Thu, 15 Jun 2000 04:42:53 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5F8gX815763 - for ; Thu, 15 Jun 2000 04:42:34 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA29072; - Thu, 15 Jun 2000 10:41:51 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Thu, 15 Jun 2000 10:41:51 +0200 -Message-ID: 
<219F68D65015D011A8E000006F8590C604AF7DE9@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" -Cc: PostgreSQL-development -Subject: AW: [HACKERS] Big 7.1 open items -Date: Thu, 15 Jun 2000 10:41:50 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> It's just not possible to do -> multiple file renames atomically. - -This is not necessary, since *_ is unique regardless of relname prefix. - -Andreas - -From scrappy@hub.org Thu Jun 15 08:30:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03846 - for ; Thu, 15 Jun 2000 08:30:58 -0400 (EDT) -Received: from thelab.hub.org (nat193.152.mpoweredpc.net [142.177.193.152]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA14167 for ; Thu, 15 Jun 2000 08:16:58 -0400 (EDT) -Received: from localhost (scrappy@localhost) - by thelab.hub.org (8.9.3/8.9.3) with ESMTP id JAA74856; - Thu, 15 Jun 2000 09:14:29 -0300 (ADT) - (envelope-from scrappy@hub.org) -X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs -Date: Thu, 15 Jun 2000 09:14:29 -0300 (ADT) -From: The Hermit Hacker -To: Bruce Momjian -cc: Tom Lane , Jan Wieck , - Oliver Elphick , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <200006150321.XAA09510@candle.pha.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=US-ASCII -Status: RO - -On Wed, 14 Jun 2000, Bruce Momjian wrote: - -> > Backtraces from *what*, exactly? 99% of the backend is still going -> > to be dealing with the same data as ever. 
It might be that poking -> > around in fd.c will be a little harder, but considering that fd.c -> > doesn't really know or care what the files it's manipulating are -> > anyway, I'm not convinced that this is a real issue. -> -> I was just throwing gdb out as an example. The bigger ones are ls, -> lsof/fstat, and tar. - -You've lost me on this one ... if someone does an lsof of the process, it -will still provide them a list of open files ... are you complaining about -the extra step required to translate the file name to a "valid table"? - -Oh, one point here ... this whole 'filenaming issue' ... as far as ls is -concerned, at least, only affects the superuser, since he's the only one -that can go 'ls'ng around in the directories ... - -And, ummm, how hard would it be to have \d in psql display the "physical -table name" as part of its output? - -Slight tangent here: - -One thing that I think would be great if we could add is some sort of: - -SELECT db_name, disk_space; - -query where a database owner, not the superuser, could see how much disk -space their tables are using up ... possible?
- - -From pgsql-hackers-owner+M3412@hub.org Thu Jun 15 08:30:55 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03842 - for ; Thu, 15 Jun 2000 08:30:54 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA15241 for ; Thu, 15 Jun 2000 08:31:29 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5FCSM877572; - Thu, 15 Jun 2000 08:28:22 -0400 (EDT) -Received: from zrtps06s.us.nortel.com ([47.140.48.50]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5FCRS877255 - for ; Thu, 15 Jun 2000 08:27:28 -0400 (EDT) -Received: from ertpg15e1.nortelnetworks.com (actually zrtph06n.us.nortel.com) - by zrtps06s.us.nortel.com; Thu, 15 Jun 2000 08:26:34 -0400 -Received: from zrtpd004.us.nortel.com (actually zrtpd004) - by ertpg15e1.nortelnetworks.com; Thu, 15 Jun 2000 08:26:11 -0400 -Received: from zrtpd003.us.nortel.com ([47.140.224.137]) - by zrtpd004.us.nortel.com - with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) - id MPQCZWMM; Thu, 15 Jun 2000 08:26:10 -0400 -Received: from americasm01.nt.com (hrtpp28d.us.nortel.com [47.190.110.250]) - by zrtpd003.us.nortel.com - with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) - id L1N0XG78; Thu, 15 Jun 2000 08:26:12 -0400 -Message-ID: <3948CBDC.5A4F5705@americasm01.nt.com> -Date: Thu, 15 Jun 2000 08:28:12 -0400 -X-Sybari-Space: 00000000 00000000 00000000 -From: "Mark Hollomon" -Reply-To: "Mark Hollomon" -Organization: Nortel Networks -X-Mailer: Mozilla 4.04 [en] (Win95; U) -MIME-Version: 1.0 -To: "Ross J. 
Reedstrom" -CC: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -X-Orig: -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Ross J. Reedstrom wrote: -> -> Any strong objections to the mixed relname_oid solution? It gets us -> everything oids does, and still lets Bruce use 'ls -l' to find the big -> tables, putting off writing any admin tools that'll need to be rewritten, -> anyway. - -I would object to the mixed name. - -Consider: - -CREATE TABLE FOO .... -ALTER TABLE FOO RENAME FOO_OLD; -CREATE TABLE FOO .... - -For the same atomicity reason, rename can't change the -name of the files. So, which foo_ is the FOO_OLD -and which is FOO? - -In other words, in the presence of rename, putting -relname in the filename is misleading at best. 
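A minimal sketch of the ambiguity described above, assuming a relname_oid filename that is fixed when the table is created and is not touched by RENAME. The Catalog class, its method names, and the starting OID are hypothetical illustrations, not PostgreSQL code; relation names are written in lower case, as the backend would fold them.

```python
class Catalog:
    """Toy stand-in for pg_class: maps each OID to its current relname."""

    def __init__(self, first_oid=1000):
        self.next_oid = first_oid      # hypothetical OID counter
        self.relname_by_oid = {}       # logical name, follows RENAME
        self.filename_by_oid = {}      # physical name, fixed at creation

    def create_table(self, relname):
        oid = self.next_oid
        self.next_oid += 1
        self.relname_by_oid[oid] = relname
        # Assumed naming rule: original relname + "_" + OID.
        self.filename_by_oid[oid] = f"{relname}_{oid}"
        return oid

    def rename_table(self, old, new):
        for oid, name in self.relname_by_oid.items():
            if name == old:
                # Only the catalog entry changes; the file keeps its name.
                self.relname_by_oid[oid] = new
                return oid
        raise KeyError(old)


catalog = Catalog()
catalog.create_table("foo")              # CREATE TABLE FOO
catalog.rename_table("foo", "foo_old")   # ALTER TABLE FOO RENAME FOO_OLD
catalog.create_table("foo")              # CREATE TABLE FOO

# Both files on disk start with "foo_"; only the catalog can tell them apart.
files = sorted(catalog.filename_by_oid.values())
```

After the three statements, both filenames carry the prefix foo_, so the relname part of the name no longer identifies the table without a catalog lookup.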
- --- - -Mark Hollomon -mhh@nortelnetworks.com -ESN 451-9008 (302)454-9008 - -From pgsql-hackers-owner+M3413@hub.org Thu Jun 15 08:30:47 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03837 - for ; Thu, 15 Jun 2000 08:30:45 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5FCTb883200; - Thu, 15 Jun 2000 08:29:37 -0400 (EDT) -Received: from smtp1.andrew.cmu.edu (SMTP1.ANDREW.CMU.EDU [128.2.10.81]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5FCT7881265 - for ; Thu, 15 Jun 2000 08:29:07 -0400 (EDT) -Received: from export.andrew.cmu.edu (EXPORT.ANDREW.CMU.EDU [128.2.23.2]) - by smtp1.andrew.cmu.edu (8.9.3/8.9.3) with ESMTP id IAA02782 - for ; Thu, 15 Jun 2000 08:29:02 -0400 (EDT) -Date: Thu, 15 Jun 2000 08:29:02 -0400 (EDT) -Message-Id: <200006151229.IAA02782@smtp1.andrew.cmu.edu> -From: Brian E Gallew -X-Mailer: BatIMail version 3.2 -To: "PostgreSQL-development" -In-reply-to: <16810.961036579@sss.pgh.pa.us> -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006150228.WAA06576@candle.pha.pa.us> <16810.961036579@sss.pgh.pa.us> -Mime-Version: 1.0 (generated by tm-edit 7.106) -Content-Type: multipart/signed; protocol="application/pgp-signature"; - boundary="pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1"; micalg=pgp-md5 -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - - ---pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1 -Content-Type: text/plain; charset=US-ASCII - -Then spoke up and said: -> Precedence: bulk -> -> Bruce Momjian writes: -> > But seriously, let me give some background. I used Ingres, that used -> > the VMS file system, but used strange sequential AAAF324 numbers for -> > tables. 
When someone deleted a table, or we were looking at what tables -> > were using disk space, it was impossible to find the Ingres table names -> > that went with the file. There was a system table that showed it, but -> > it was poorly documented, and if you deleted the table, there was no way -> > to look on the tape to find out which file to restore. -> -> Fair enough, but it seems to me that the answer is to expend some effort -> on system admin support tools. We could do a lot in that line with less -> effort than trying to make a fundamentally mismatched filesystem -> representation do what we need. - -We've been an Ingres shop as long as there's been an Ingres. While -we've also had the problem Bruce noticed with table names, we've -*also* used the trivial fix of running a (simple) Report Writer job -each night, immediately before the backup, that lists all of the -database tables/indices and the underlying files. - -True, if someone drops/recreates a table twice between backups we -can't find the intermediate file name, but since we also haven't -backed up that filename, this isn't an issue. - -Also, the consistency issue is really not as important as you would -think. If you are restoring a table, you want the information in it, -whether or not it's consistent with anything else. I've done hundreds -of table restores (can you say "modify table to heap"?) and never once -has inconsistency been an issue. Oh, yeah, and we don't shut the -database down for this, either. (That last isn't my choice, BTW.) - --- -===================================================================== -| JAVA must have been developed in the wilds of West Virginia. | -| After all, why else would it support only single inheritance?? | -===================================================================== -| Finger geek@cmu.edu for my public key.
| -===================================================================== - ---pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1 -Content-Type: application/pgp-signature -Content-Transfer-Encoding: 7bit - ------BEGIN PGP MESSAGE----- -Version: 2.6.2 -Comment: Processed by Mailcrypt 3.3, an Emacs/PGP interface - -iQBVAwUBOUjMDYdzVnzma+gdAQHUowH+JglNasUWT5RKSnF3pzNdy5nyrGmLhbWa -Oom1oUqToxcyfjVFL34dXpnIlvNHO0K2Di4NKZ9HykwOHzrnExf15w== -=yXoe ------END PGP MESSAGE----- - ---pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1-- - - -From dhogaza@pacifier.com Thu Jun 15 09:31:05 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA04418 - for ; Thu, 15 Jun 2000 09:31:04 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA20080 for ; Thu, 15 Jun 2000 09:22:36 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id GAA05755; - Thu, 15 Jun 2000 06:21:54 -0700 (PDT) -Message-Id: <3.0.1.32.20000615054049.011bcec0@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Thu, 15 Jun 2000 05:40:49 -0700 -To: Zeugswetter Andreas SB , - Bruce Momjian , Tom Lane -From: Don Baccus -Subject: Re: AW: [HACKERS] Big 7.1 open items -Cc: Jan Wieck , Oliver Elphick , - PostgreSQL-development -In-Reply-To: <219F68D65015D011A8E000006F8590C604AF7DE7@sdexcsrv1.f000.d0 - 188.sd.spardat.at> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 10:04 AM 6/15/00 +0200, Zeugswetter Andreas SB wrote: -> ->> In reality, very few people are going to be interested in restoring ->> a table in a way that breaks referential integrity and other ->> normal assumptions about what exists in the database. -> ->This is not true. 
In my DBA history it would have saved me manweeks ->of work if an easy and efficient restore of one single table from backup ->would have been available in Informix and Oracle. ->We allways had to restore most of the whole system to another machine only ->to get back at some table info that would then be manually re-added ->to the production system. - -I'm missing something, I guess. You would do a createdb, do a filesystem -copy of pg_log and one file into it, and then read data from the table - without having to restore the other tables in the database? - -I'm just curious - when was the last time you restored a Postgres -database in this piecemeal manner, and how often do you do it? - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From pgsql-hackers-owner+M3440@hub.org Thu Jun 15 14:46:22 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA04607 - for ; Thu, 15 Jun 2000 14:46:21 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA12695 for ; Thu, 15 Jun 2000 12:48:58 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5FGjXI40370; - Thu, 15 Jun 2000 12:45:33 -0400 (EDT) -Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5FGjJI39359 - for ; Thu, 15 Jun 2000 12:45:20 -0400 (EDT) -Received: by rice.edu - via sendmail from stdin - id (Debian Smail3.2.0.102) - for pgsql-hackers@postgresql.org; Thu, 15 Jun 2000 11:45:19 -0500 (CDT) -Date: Thu, 15 Jun 2000 11:45:19 -0500 -From: "Ross J. 
Reedstrom" -To: Tom Lane -Cc: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -Message-ID: <20000615114519.B3939@rice.edu> -Mail-Followup-To: Tom Lane , - PostgreSQL-development -References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> <18798.961053112@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -User-Agent: Mutt/1.0i -In-Reply-To: <18798.961053112@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Thu, Jun 15, 2000 at 03:11:52AM -0400 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote: -> "Ross J. Reedstrom" writes: -> > Any strong objections to the mixed relname_oid solution? -> -> Yes! -> -> You cannot make it work reliably unless the relname part is the original -> relname and does not track ALTER TABLE RENAME. IMHO having an obsolete -> relname in the filename is worse than not having the relname at all; -> it's a recipe for confusion, it means you still need admin tools to tell -> which end is really up, and what's worst is you might think you don't. - -The plan here was to let VACUUM handle renaming the file, since it -will already have all the necessary locks. This shortens the window -of confusion. ALTER TABLE RENAME doesn't happen that often, really - -the relname is there just for human consumption, then. - -> -> Furthermore it requires an additional column in pg_class to keep track -> of the original relname, which is a waste of space and effort. -> - -I actually started down this path thinking about implementing SCHEMA, -since tables in the same DB but in different schema can have the same -relname, I figured I needed to change that. We'll need something in -pg_class to keep track of what schema a relation is in, instead. 
- -> It also creates a portability risk, or at least fails to remove one, -> since you are critically dependent on the assumption that the OS -> supports long filenames --- on a filesystem that truncates names to less -> than about 45 characters you're in very deep trouble. An OID-only -> approach still works on traditional 14-char-filename Unix filesystems -> (it'd mostly even work on DOS 8+3, though I doubt we care about that). - -Actually, no. Since I store the filename in a name attribute, I used this -nifty function somebody wrote, makeObjectName, to trim the relname part, -but leave the oid. (Yes, I know it's yours ;-) - -> -> Finally, one of the reasons I want to go to filenames based only on OID -> is that that'll make life easier for mdblindwrt. Original relname + OID -> doesn't help, in fact it makes life harder (more shmem space needed to -> keep track of the filename for each buffer). - -Can you explain in more detail how this helps? Not by letting the bufmgr -know that oid == filename, I hope. We need to improve the abstraction -of the smgr, not add another violation. Ah, sorry, mdblindwrt _is_ -in the smgr. - -Hmm, grovelling through that code, I see how it could be simpler if reloid -== filename. Heck, we even get to save shmem in the buffdesc.blind part, -since we only need the dbname in there, now. - -Hmm, I see I missed the relpath_blind() in my patch - oops. (relpath() -is always called with RelationGetPhysicalRelationName(), and that's -where I was putting in the relphysname) - -Hmm, what's all this with functions in catalog.c that are only called by -smgr/md.c? seems to me that anything having to do with physical storage -(like the path!) belongs in the smgr abstraction. - -> -> Can we *PLEASE JUST LET GO* of this bad idea? No relname in the -> filename. Period. -> - -Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at -all_ when I first put up patches two months ago.
O.K., I'll do the oids -only version (and fix up relpath_blind) - -Ross - --- -Ross J. Reedstrom, Ph.D., -NSBRI Research Scientist/Programmer -Computer and Information Technology Institute -Rice University, 6100 S. Main St., Houston, TX 77005 - -From Inoue@tpf.co.jp Thu Jun 15 17:45:40 2000 -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA27548 - for ; Thu, 15 Jun 2000 17:45:37 -0400 (EDT) -Received: from mcadnote1 (ppm122.noc.fukui.nsk.ne.jp [210.161.188.41]) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id GAA07248; Fri, 16 Jun 2000 06:45:30 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" , - "Ross J. Reedstrom" -Cc: "Tom Lane" , - "PostgreSQL-development" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 06:48:21 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="us-ascii" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -In-Reply-To: <200006151935.PAA17512@candle.pha.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -Importance: Normal -Status: ROr - -> -----Original Message----- -> From: pgsql-hackers-owner@hub.org -> [mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian -> -> > > Can we *PLEASE JUST LET GO* of this bad idea? No relname in the -> > > filename. Period. -> > > -> > -> > Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at -> > all_ when I first put up patches two month ago. O.K., I'll do the oids -> > only version (and fix up relpath_blind) -> -> Hold on. I don't think we want that work done yet. Seems even Tom is -> thinking that if Vadim is going to re-do everything later anyway, we may -> be better with a relname/oid solution that does require additional -> administration apps. -> - -Hmm,why is naming rule first ? 
- -I've never emphasized naming rule except that it should be unique. -It has been my main point to reduce the necessity of naming rule -as much as possible. IIRC,by keeping the stored place in pg_class,Ross's -trial patch remains only 2 places where naming rule is required. -So wouldn't we be free from naming rule(it would not be so difficult -to change naming rule if the rule is found to be bad) ? - -I've also mentioned many times neither relname nor oid is sufficient -for the uniqueness. In addition neither relname nor oid would be -necessary for the uniqueness. -IMHO,it's bad to rely on the item which is neither necessary nor -sufficient. -I proposed relname+unique_id naming once. The unique_id is -independent from oid. The relname is only for convenience for -DBA and so we don't have to change it due to RENAME. -Db's consistency is much more important than dba's satisfaction. - -Comments ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - -From pgsql-hackers-owner+M3448@hub.org Thu Jun 15 19:01:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00764 - for ; Thu, 15 Jun 2000 19:01:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id SAA17328 for ; Thu, 15 Jun 2000 18:57:32 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5FMsMI97744; - Thu, 15 Jun 2000 18:54:22 -0400 (EDT) -Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5FMs0I94252 - for ; Thu, 15 Jun 2000 18:54:00 -0400 (EDT) -Received: by rice.edu - via sendmail from stdin - id (Debian Smail3.2.0.102) - for pgsql-hackers@postgresql.org; Thu, 15 Jun 2000 17:53:59 -0500 (CDT) -Date: Thu, 15 Jun 2000 17:53:59 -0500 -From: "Ross J.
Reedstrom" -To: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -Message-ID: <20000615175359.A12194@rice.edu> -Mail-Followup-To: PostgreSQL-development -References: <200006152148.RAA27790@candle.pha.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -User-Agent: Mutt/1.0i -In-Reply-To: <200006152148.RAA27790@candle.pha.pa.us>; from pgman@candle.pha.pa.us on Thu, Jun 15, 2000 at 05:48:59PM -0400 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -On Thu, Jun 15, 2000 at 05:48:59PM -0400, Bruce Momjian wrote: -> > I've also mentioned many times neither relname nor oid is sufficient -> > for the uniqueness. In addiiton neither relname nor oid would be -> > necessary for the uniqueness. -> > IMHO,it's bad to rely on the item which is neither necessary nor -> > sufficient. -> > I proposed relname+unique_id naming once. The unique_id is -> > independent from oid. The relname is only for convinience for -> > DBA and so we don't have to change it due to RENAME. -> > Db's consistency is much more important than dba's satis- -> > faction. -> > -> > Comments ? -> -> I am happy not to rename the file on 'RENAME', but seems no one likes -> that. - -Good, 'cause that's how I've implemented it so far. Actually, all -I've done is port my previous patch to current, with one little -change: I added a macro RelationGetRealRelationName which does what -RelationGetPhysicalRelationName used to do: i.e. return the relname with -no temptable funny business, and used that for the relcache macros. It -passes all the serial regression tests: I haven't run the parallel tests -yet. ALTER TABLE RENAME rolls back nicely. I'll need to learn some more -about xacts to get DROP TABLE rolling back. - -I'll drop it on PATCHES right now, for comment. - -Ross --- -Ross J. Reedstrom, Ph.D., -NSBRI Research Scientist/Programmer -Computer and Information Technology Institute -Rice University, 6100 S.
Main St., Houston, TX 77005 - -From pgsql-hackers-owner+M3451@hub.org Thu Jun 15 20:01:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA01651 - for ; Thu, 15 Jun 2000 20:00:59 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA20985 for ; Thu, 15 Jun 2000 19:57:49 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5FNsgI25402; - Thu, 15 Jun 2000 19:54:42 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5FNsCI22412 - for ; Thu, 15 Jun 2000 19:54:12 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA02263; - Thu, 15 Jun 2000 19:53:52 -0400 (EDT) -To: "Ross J. Reedstrom" -cc: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <20000615114519.B3939@rice.edu> -References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> <18798.961053112@sss.pgh.pa.us> <20000615114519.B3939@rice.edu> -Comments: In-reply-to "Ross J. Reedstrom" - message dated "Thu, 15 Jun 2000 11:45:19 -0500" -Date: Thu, 15 Jun 2000 19:53:52 -0400 -Message-ID: <2260.961113232@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Ross J. Reedstrom" writes: -> On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote: ->> "Ross J. Reedstrom" writes: ->>>> Any strong objections to the mixed relname_oid solution? ->> ->> Yes! - -> The plan here was to let VACUUM handle renaming the file, since it -> will already have all the necessary locks. This shortens the window -> of confusion. 
ALTER TABLE RENAME doesn't happen that often, really - -> the relname is there just for human consumption, then. - -Yeah, I've seen tons of discussion of how if we do this, that, and -the other thing, and be prepared to fix up some other things in case -of crash recovery, we can make it work with filename == relname + OID -(where relname tracks logical name, at least at some remove). - -Probably. Assuming nobody forgets anything. - -I'm just trying to point out that that's a huge amount of pretty -delicate mechanism. The amount of work required to make it trustworthy -looks to me to dwarf the admin tools that Bruce is complaining about. -And we only have a few people competent to do the work. (With all -due respect, Ross, if you weren't already aware of the implications -for mdblindwrt, I have to wonder what else you missed.) - -Filename == OID is so simple, reliable, and straightforward by -comparison that I think the decision is a no-brainer. - -If we could afford to sink unlimited time into this one issue then -it might make sense to do it the hard way, but we have enough -important stuff on our TODO list to keep us all busy for years --- -I cannot believe that it's an effective use of our time to do this. - - -> Hmm, what's all this with functions in catalog.c that are only called by -> smgr/md.c? seems to me that anything having to do with physical storage -> (like the path!) belongs in the smgr abstraction. - -Yeah, there's a bunch of stuff that should have been implemented by -adding new smgr entry points, but wasn't. It should be pushed down. -(I can't resist pointing out that one of those things is physical -relation rename, which will go away and not *need* to be pushed down -if we do it the way I want.) 
- - regards, tom lane - -From tgl@sss.pgh.pa.us Thu Jun 15 20:00:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA01647 - for ; Thu, 15 Jun 2000 20:00:58 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA21034 for ; Thu, 15 Jun 2000 19:58:30 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA02283; - Thu, 15 Jun 2000 19:57:05 -0400 (EDT) -To: Bruce Momjian -cc: "Ross J. Reedstrom" , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006151935.PAA17512@candle.pha.pa.us> -References: <200006151935.PAA17512@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Thu, 15 Jun 2000 15:35:45 -0400" -Date: Thu, 15 Jun 2000 19:57:05 -0400 -Message-ID: <2280.961113425@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Bruce Momjian writes: ->> Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at ->> all_ when I first put up patches two month ago. O.K., I'll do the oids ->> only version (and fix up relpath_blind) - -> Hold on. I don't think we want that work done yet. Seems even Tom is -> thinking that if Vadim is going to re-do everything later anyway, we may -> be better with a relname/oid solution that does require additional -> administration apps. - -Don't put words in my mouth, please. If we are going to throw the -work away later, it'd be foolish to do the much greater amount of -work needed to make filename=relname+OID fly than is needed for -filename=OID. - -However, I'm pretty sure I recall Vadim stating that he thought -filename=OID would be required for his smgr changes anyway... 
- - regards, tom lane - -From pgsql-hackers-owner+M3453@hub.org Thu Jun 15 21:01:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA02731 - for ; Thu, 15 Jun 2000 21:01:01 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA23469 for ; Thu, 15 Jun 2000 20:36:36 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5G0WDI97134; - Thu, 15 Jun 2000 20:32:13 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5G0VsI97003 - for ; Thu, 15 Jun 2000 20:31:54 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id JAA07328; Fri, 16 Jun 2000 09:26:04 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" , - "Tom Lane" -Cc: "PostgreSQL-development" , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 09:28:14 +0900 -Message-ID: <000d01bfd729$c24b29c0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <2260.961113232@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Importance: Normal -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On -> Behalf Of Tom Lane -> -> "Ross J. Reedstrom" writes: -> > On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote: -> >> "Ross J. Reedstrom" writes: -> >>>> Any strong objections to the mixed relname_oid solution? -> >> -> >> Yes! 
-> -> > The plan here was to let VACUUM handle renaming the file, since it -> > will already have all the necessary locks. This shortens the window -> > of confusion. ALTER TABLE RENAME doesn't happen that often, really - -> > the relname is there just for human consumption, then. -> -> Yeah, I've seen tons of discussion of how if we do this, that, and -> the other thing, and be prepared to fix up some other things in case -> of crash recovery, we can make it work with filename == relname + OID -> (where relname tracks logical name, at least at some remove). -> - -I've seen little discussion of how to avoid the use of naming rule. -I've proposed many times that we should keep the information -where the table is stored in our database itself. I've never seen -clear objections to it. So I could understand my proposal is OK ? -Isn't it much more important than naming rule ? Under the -mechanism,we could easily replace bad naming rule. -And I believe that Ross's work is mostly around the mechanism -not naming rule. - -Now I like neither relname nor oid because it's not sufficient -for my purpose. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From tgl@sss.pgh.pa.us Thu Jun 15 22:01:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA03637 - for ; Thu, 15 Jun 2000 22:01:01 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA28521 for ; Thu, 15 Jun 2000 21:58:46 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA02730; - Thu, 15 Jun 2000 21:57:27 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <000d01bfd729$c24b29c0$2801007e@tpf.co.jp> -References: <000d01bfd729$c24b29c0$2801007e@tpf.co.jp> -Comments: In-reply-to "Hiroshi Inoue" - message dated "Fri, 16 Jun 2000 09:28:14 +0900" -Date: Thu, 15 Jun 2000 21:57:27 -0400 -Message-ID: <2727.961120647@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -"Hiroshi Inoue" writes: -> Now I like neither relname nor oid because it's not sufficient -> for my purpose. - -We should probably not do much of anything with this issue until -we have a clearer understanding of what we want to do about -tablespaces and schemas. - -My gut feeling is that we will end up with pathnames that look -something like - -.../data/base/DBNAME/TABLESPACE/OIDOFRELATION - -(with .N attached if a segment of a large relation, of course). - -The TABLESPACE "name" should likely be an OID itself, but it wouldn't -have to be if you are willing to say that tablespaces aren't renamable. -(Come to think of it, does anyone care about being able to rename -databases? ;-)) Note that the TABLESPACE will often be a symlink -to storage on another drive, rather than a plain subdirectory of the -DBNAME, but that shouldn't be an issue at this level of discussion. - -I think that schemas probably don't enter into this. We should instead -rely on the uniqueness of OIDs to prevent filename collisions. However, -OIDs aren't really unique: different databases in an installation will -use the same OIDs for their system tables. My feeling is that we can -live with a restriction like "you can't store the system tables of -different databases in the same tablespace". Alternatively we could -avoid that issue by inverting the pathname order: - -.../data/base/TABLESPACE/DBNAME/OIDOFRELATION - -Note that in any case, system tables will have to live in a -predetermined tablespace, since you can't very well look in pg_class -to find out which tablespace pg_class lives in. 
Perhaps we should -just reserve a tablespace per database for system tables and forget -the whole issue. If we do that, there's not really any need for -the database in the path! Just - -.../data/base/TABLESPACE/OIDOFRELATION - -would do fine and help reduce lookup overhead. - -BTW, schemas do make things interesting for the other camp: -is it possible for the same table to be referenced by different -names in different schemas? If so, just how useful is it to pick -one of those names arbitrarily for the filename? This is an advanced -version of the main objection to using the original relname and not -updating it at RENAME TABLE --- sooner or later, the filenames are -going to be more confusing than helpful. - -Comments? Have I missed something important about schemas? - - regards, tom lane - -From pgsql-hackers-owner+M3457@hub.org Thu Jun 15 22:27:45 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA04586 - for ; Thu, 15 Jun 2000 22:27:44 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5G2POI23418; - Thu, 15 Jun 2000 22:25:24 -0400 (EDT) -Received: from candle.pha.pa.us (pgman@nav-43.dsl.navpoint.com [162.33.245.46]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5G2P3I23299 - for ; Thu, 15 Jun 2000 22:25:04 -0400 (EDT) -Received: (from pgman@localhost) - by candle.pha.pa.us (8.9.0/8.9.0) id WAA04345; - Thu, 15 Jun 2000 22:24:53 -0400 (EDT) -From: Bruce Momjian -Message-Id: <200006160224.WAA04345@candle.pha.pa.us> -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <2727.961120647@sss.pgh.pa.us> "from Tom Lane at Jun 15, 2000 09:57:27 - pm" -To: Tom Lane -Date: Thu, 15 Jun 2000 22:24:52 -0400 (EDT) -CC: Hiroshi Inoue , Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -X-Mailer: ELM [version 2.4ME+ PL77 (25)] -MIME-Version: 1.0 -Content-Transfer-Encoding: 7bit -Content-Type: text/plain; charset=US-ASCII -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> "Hiroshi Inoue" writes: -> > Now I like neither relname nor oid because it's not sufficient -> > for my purpose. -> -> We should probably not do much of anything with this issue until -> we have a clearer understanding of what we want to do about -> tablespaces and schemas. - -Here is an analysis of our options: - - Work required Disadvantages ----------------------------------------------------------------------------- - -Keep current system no work rename/create no rollback - -relname/oid but less work new pg_class column, -no rename change filename not accurate on - rename - -relname/oid with more work complex code -rename change during -vacuum - -oid filename less work, but confusing to admins - need admin tools - --- - Bruce Momjian | http://www.op.net/~candle - pgman@candle.pha.pa.us | (610) 853-3000 - + If your life is a hard drive, | 830 Blythe Avenue - + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 - -From Inoue@tpf.co.jp Thu Jun 15 22:41:50 2000 -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA05230 - for ; Thu, 15 Jun 2000 22:41:48 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id LAA07495; Fri, 16 Jun 2000 11:41:43 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 11:43:52 +0900 -Message-ID: <000201bfd73c$b52873c0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <2727.961120647@sss.pgh.pa.us> -Importance: Normal -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -Sorry for my previous mail. It was posted by my mistake. - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> "Hiroshi Inoue" writes: -> > Now I like neither relname nor oid because it's not sufficient -> > for my purpose. -> -> We should probably not do much of anything with this issue until -> we have a clearer understanding of what we want to do about -> tablespaces and schemas. -> -> My gut feeling is that we will end up with pathnames that look -> something like -> -> .../data/base/DBNAME/TABLESPACE/OIDOFRELATION -> - -Schema is a logical concept and irrevant to physical location. -I strongly object your suggestion unless above means *default* -location. -Tablespace is an encapsulation of table allocation and the -name should be irrevant to the location basically. So above -seems very bad for me. - -Anyway I don't see any advantage in fixed mapping impleme -ntation. After renewal,we should at least have a possibility to -allocate a specific table in arbitrary separate directory. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From Inoue@tpf.co.jp Thu Jun 15 23:31:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06634; - Thu, 15 Jun 2000 23:30:59 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA03227; Thu, 15 Jun 2000 23:18:54 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id MAA07544; Fri, 16 Jun 2000 12:18:06 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" , "Tom Lane" -Cc: "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 12:20:16 +0900 -Message-ID: <000401bfd741$cabea100$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <200006160224.WAA04345@candle.pha.pa.us> -Importance: Normal -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> -> > "Hiroshi Inoue" writes: -> > > Now I like neither relname nor oid because it's not sufficient -> > > for my purpose. -> > -> > We should probably not do much of anything with this issue until -> > we have a clearer understanding of what we want to do about -> > tablespaces and schemas. 
-> -> Here is an analysis of our options: -> -> Work required Disadvantages -> ------------------------------------------------------------------ -> ---------- -> -> Keep current system no work rename/create -> no rollback -> -> relname/oid but less work new pg_class column, -> no rename change filename not -> accurate on -> rename -> -> relname/oid with more work complex code -> rename change during -> vacuum -> -> oid filename less work, but confusing to admins -> need admin tools -> - -Please add my opinion for naming rule. - -relname/unique_id but need some work new pg_class column, -no relname change. for unique-id generation filename not relname - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From pgsql-hackers-owner+M3465@hub.org Fri Jun 16 00:01:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA06924 - for ; Fri, 16 Jun 2000 00:01:00 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA05470 for ; Thu, 15 Jun 2000 23:59:46 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5G3uaI10809; - Thu, 15 Jun 2000 23:56:36 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5G3uKI10702 - for ; Thu, 15 Jun 2000 23:56:21 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id MAA07571; Fri, 16 Jun 2000 12:55:33 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "PostgreSQL-development" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 12:57:44 +0900 -Message-ID: <000501bfd747$067f0220$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 
4.71.2173.0 -In-Reply-To: <3264.961127021@sss.pgh.pa.us> -Importance: Normal -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> "Hiroshi Inoue" writes: -> > Please add my opinion for naming rule. -> -> > relname/unique_id but need some work new -> pg_class column, -> > no relname change. for unique-id generation filename not relname -> -> Why is a unique ID better than --- or even different from --- -> using the relation's OID? It seems pointless to me... -> - -For example,in the implementation of CLUSTER command, -we would need another new file for the target relation in -order to put sorted rows but don't we want to change the -OID ? It would be needed for table re-construction generally. -If I remember correectly,you once proposed OID+version -naming for the cases. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From Inoue@tpf.co.jp Fri Jun 16 02:01:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA08093 - for ; Fri, 16 Jun 2000 02:00:59 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA10174 for ; Fri, 16 Jun 2000 01:34:44 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id OAA07656; Fri, 16 Jun 2000 14:33:12 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 14:35:21 +0900 -Message-ID: <000001bfd754$a9e44f80$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <3238.961126521@sss.pgh.pa.us> -Importance: Normal -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> "Hiroshi Inoue" writes: -> > Tablespace is an encapsulation of table allocation and the -> > name should be irrevant to the location basically. So above -> > seems very bad for me. -> > Anyway I don't see any advantage in fixed mapping impleme -> > ntation. After renewal,we should at least have a possibility to -> > allocate a specific table in arbitrary separate directory. -> -> Call a "directory" a "tablespace" and we're on the same page, -> aren't we? Actually I'd envision some kind of admin command -> "CREATE TABLESPACE foo AS /path/to/wherever". - -Yes,I think 'tablespace -> directory' is the most natural -extension under current file_per_table storage manager. -If many_tables_in_a_file storage manager is introduced,we -may be able to change the definiiton of TABLESPACE -to 'tablespace -> files' like Oracle. - -> That would make -> appropriate system catalog entries and also create a symlink -> from ".../data/base/foo" (or some such place) to the target -> directory. -> Then when we make a table in that tablespace, -> it's in the right place. Problem solved, no? -> - -I don't like symlink for dbms data files. However it may -be OK,If symlink are limited to 'tablespace->directory' -corrspondence and all tablespaces(including default -etc) are symlink. It is simple and all debugging would -be processed under tablespace_is_symlink environment. 
- -> It gets a little trickier if you want to be able to split -> multi-gig tables across several tablespaces, though, since -> you couldn't just append ".N" to the base table path in that -> scenario. -> - -This seems to be not that easy to solve now. -Ross doesn't change this naming rule for multi-gig -tables either in his trial. - -> I'd be interested to know what sort of facilities Oracle -> provides for managing huge tables... -> - -In my knowledge about old Oracle,one TABLESPACE -could have many DATAFILEs which could contain -many tables. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From pgsql-hackers-owner+M3469@hub.org Fri Jun 16 02:01:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA08109 - for ; Fri, 16 Jun 2000 02:01:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA11218 for ; Fri, 16 Jun 2000 01:57:33 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5G5tLI49492; - Fri, 16 Jun 2000 01:55:21 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5G5tAI49395 - for ; Fri, 16 Jun 2000 01:55:10 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA05749; - Fri, 16 Jun 2000 01:54:46 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "PostgreSQL-development" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <000501bfd747$067f0220$2801007e@tpf.co.jp> -References: <000501bfd747$067f0220$2801007e@tpf.co.jp> -Comments: In-reply-to "Hiroshi Inoue" - message dated "Fri, 16 Jun 2000 12:57:44 +0900" -Date: Fri, 16 Jun 2000 01:54:46 -0400 -Message-ID: <5746.961134886@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - 
-"Hiroshi Inoue" writes: ->> Why is a unique ID better than --- or even different from --- ->> using the relation's OID? It seems pointless to me... - -> For example,in the implementation of CLUSTER command, -> we would need another new file for the target relation in -> order to put sorted rows but don't we want to change the -> OID ? It would be needed for table re-construction generally. -> If I remember correectly,you once proposed OID+version -> naming for the cases. - -Hmm, so you are thinking that the pg_class row for the table would -include this uniqueID, and then committing the pg_class update would -be the atomic action that replaces the old table contents with the -new? It does have some attraction now that I think about it. - -But there are other ways we could do the same thing. If we want to -have tablespaces, there will need to be a tablespace identifier in -each pg_class row. So we could do CLUSTER in the same way as we'd -move a table from one tablespace to another: create the new files in -the new tablespace directory, and the commit of the new pg_class row -with the new tablespace value is the atomic action that makes the new -files valid and the old files not. - -You will probably say "but I didn't want to move my table to a new -tablespace just to cluster it!" I think we could live with that, -though. A tablespace doesn't need to have any existence more concrete -than a subdirectory, in my vision of the way things would work. We -could do something like making two subdirectories of each place that -the dbadmin designates as a "tablespace", so that we make two logical -tablespaces out of what the dbadmin thinks of as one. Then we can -ping-pong between those directories to do things like clustering "in -place". - -Basically I want to keep the bottom-level mechanisms as simple and -reliable as we possibly can. The fewer concepts are known down at -the bottom, the better. 
If we can keep the pathname constituents -to just "tablespace" and "relation OID" we'll be in great shape --- -but each additional concept that has to be known down there is -another potential problem. - - regards, tom lane - -From pgsql-hackers-owner+M3471@hub.org Fri Jun 16 03:31:05 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA12816 - for ; Fri, 16 Jun 2000 03:31:04 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA14405 for ; Fri, 16 Jun 2000 03:03:38 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5G71YI83633; - Fri, 16 Jun 2000 03:01:34 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5G713I82023 - for ; Fri, 16 Jun 2000 03:01:04 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id QAA07731; Fri, 16 Jun 2000 16:00:57 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "PostgreSQL-development" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Fri, 16 Jun 2000 16:03:06 +0900 -Message-ID: <000101bfd760$ebcee3e0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <5746.961134886@sss.pgh.pa.us> -Importance: Normal -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> "Hiroshi Inoue" writes: -> >> Why is a unique ID better than --- or even different from --- -> >> using the relation's OID? It seems pointless to me... 
-> -> > For example,in the implementation of CLUSTER command, -> > we would need another new file for the target relation in -> > order to put sorted rows but don't we want to change the -> > OID ? It would be needed for table re-construction generally. -> > If I remember correectly,you once proposed OID+version -> > naming for the cases. -> -> Hmm, so you are thinking that the pg_class row for the table would -> include this uniqueID, - -No,I just include the place where the table is stored(pathname under -current file_per_table storage manager) in the pg_class row because -I don't want to rely on table allocating rule(naming rule for current) -to access existent relation files. This has always been my main point. -Many_tables_in_a_file storage manager wouldn't be able to live without -keeping this kind of infomation. -This information(where it is stored) is diffrent from tablespace(where -to store) information. There was an idea to keep the information into -opaque entry in pg_class which only a specific storage manager -could handle. There was an idea to have a new system table which -keeps the information. and so on... - -> and then committing the pg_class update would -> be the atomic action that replaces the old table contents with the -> new? It does have some attraction now that I think about it. -> -> But there are other ways we could do the same thing. If we want to -> have tablespaces, there will need to be a tablespace identifier in -> each pg_class row. So we could do CLUSTER in the same way as we'd -> move a table from one tablespace to another: create the new files in -> the new tablespace directory, and the commit of the new pg_class row -> with the new tablespace value is the atomic action that makes the new -> files valid and the old files not. -> -> You will probably say "but I didn't want to move my table to a new -> tablespace just to cluster it!" - -Yes. - -> I think we could live with that, -> though. 
A tablespace doesn't need to have any existence more concrete -> than a subdirectory, in my vision of the way things would work. We -> could do something like making two subdirectories of each place that -> the dbadmin designates as a "tablespace", so that we make two logical -> tablespaces out of what the dbadmin thinks of as one. - -Certainly we could design TABLESPACE(where to store) as above. - -> Then we can -> ping-pong between those directories to do things like clustering "in -> place". -> - -But maybe we must keep the directory information where the table was -*ping-ponged* in (e.g.) pg_class. Is such an implementation cleaner or -more extensible than mine(keeping the stored place exactly) ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From pgsql-hackers-owner+M3473@hub.org Fri Jun 16 04:01:12 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA13087 - for ; Fri, 16 Jun 2000 04:01:11 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA16002 for ; Fri, 16 Jun 2000 03:37:24 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5G7ZZI51521; - Fri, 16 Jun 2000 03:35:35 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5G7ZEI51350 - for ; Fri, 16 Jun 2000 03:35:14 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA06103; - Fri, 16 Jun 2000 03:34:47 -0400 (EDT) -To: Chris Bitmead -cc: PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <3949BCC4.8424A58F@nimrod.itg.telecom.com.au> -References: <200006142043.WAA07887@hot.jw.home> <16606.961034835@sss.pgh.pa.us> <3949BCC4.8424A58F@nimrod.itg.telecom.com.au> -Comments: In-reply-to Chris Bitmead - message dated "Fri, 16 Jun 2000 15:36:04 
+1000" -Date: Fri, 16 Jun 2000 03:34:47 -0400 -Message-ID: <6100.961140887@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Chris Bitmead writes: -> Tom Lane wrote: ->> I don't see a lot of value in that. Better to do something like ->> tablespaces: ->> ->> // - -> What is the benefit of having oidoftablespace in the directory path? -> Isn't tablespace an idea so you can store it somewhere completely -> different? -> Or is there some symlink idea or something? - -Exactly --- I'm assuming that the tablespace "directory" is likely -to be a symlink to some other mounted volume. The point here is -to keep the low-level file access routines from having to know very -much about tablespaces or file organization. In the above proposal, -all they need to know is the relation's OID and the name (or OID) -of the tablespace the relation's assigned to; then they can form -a valid path using a hardwired rule. There's still plenty of -flexibility of organization, but it's not necessary to know that -where the rubber meets the road (eg, when you're down inside mdblindwrt -trying to dump a dirty buffer to disk with no spare resources to find -out anything about the relation the page belongs to...) 
- - regards, tom lane - -From JanWieck@t-online.de Fri Jun 16 11:01:06 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA28913 - for ; Fri, 16 Jun 2000 11:01:05 -0400 (EDT) -Received: from mailout05.sul.t-online.com (mailout05.sul.t-online.com [194.25.134.82]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id KAA01818 for ; Fri, 16 Jun 2000 10:46:42 -0400 (EDT) -Received: from fwd06.sul.t-online.de - by mailout05.sul.t-online.com with smtp - id 132xN9-0006ze-03; Fri, 16 Jun 2000 16:45:27 +0200 -Received: from hot.jw.home (340000654369-0001@[62.158.179.251]) by fwd06.sul.t-online.de - with esmtp id 132xMx-0E54HQC; Fri, 16 Jun 2000 16:45:15 +0200 -Received: (from wieck@localhost) - by hot.jw.home (8.8.5/8.8.5) id OAA15163; - Fri, 16 Jun 2000 14:42:12 +0200 -From: JanWieck@t-online.de (Jan Wieck) -Message-Id: <200006161242.OAA15163@hot.jw.home> -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <3238.961126521@sss.pgh.pa.us> from Tom Lane at "Jun 15, 2000 11:35:21 - pm" -To: Tom Lane -Date: Fri, 16 Jun 2000 14:42:12 +0200 (MEST) -CC: Hiroshi Inoue , Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Reply-To: Jan Wieck -X-Mailer: ELM [version 2.4ME+ PL68 (25)] -MIME-Version: 1.0 -Content-Type: text/plain; charset=US-ASCII -Content-Transfer-Encoding: 7bit -X-Sender: 340000654369-0001@t-dialin.net -Status: ROr - -Tom Lane wrote: -> -> It gets a little trickier if you want to be able to split -> multi-gig tables across several tablespaces, though, since -> you couldn't just append ".N" to the base table path in that -> scenario. -> -> I'd be interested to know what sort of facilities Oracle -> provides for managing huge tables... - - Oracle tablespaces are a collection of 1...n preallocated - files. Each table then is bound to a tablespace and - allocates extents (chunks) from those files. 
- - There are some per table attributes that control the extent - sizes with default values coming from the tablespace. The - initial extent size, the nextextent and the pctincrease. - There is a hardcoded limit for the number of extents a table - can have at all. In Oracle7 it was 512 (or somewhat below - - don't recall correct). Maybe that's gone with Oracle8, don't - know. - - This storage concept has IMHO a couple of advatages over - ours. - - The tablespace files are preallocated, so there will - never be a change in block allocation during runtime and - that's the base for fdatasync() beeing sufficient at - syncpoints. All what might be inaccurate after a crash is - the last modified time in the inode, and that's totally - irrelevant for Oracle. The fsck will never fail, and - anything is up to Oracle's recovery. - - The number of total tablespace files is limited to a - value that ensures, that the backends can keep them all - open all the time. It's hard to exceed that limit. A - typical SAP installation with more than 20,000 - tables/indices doesn't need more than 30 or 40 of them. - - It is perfectly prepared for raw devices, since a - tablespace in a raw device installation is simply an area - of blocks on a disk. - - There are also disadvantages. - - You can run out of space even if there are plenty GB's - free on your disks. You have to create tablespaces - explicitly. - - If you've choosen inadequate extent size parameters, you - end up with high fragmented tables (slowing down) or get - stuck with running against maxextents, where only a reorg - (export/import) helps. - - -Jan - --- - -#======================================================================# -# It's easier to get forgiveness for being wrong than for being right. # -# Let's break this rule - forgive me. 
# -#================================================== JanWieck@Yahoo.com # - - - -From tgl@sss.pgh.pa.us Fri Jun 16 11:00:40 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA28898 - for ; Fri, 16 Jun 2000 11:00:39 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA07184; - Fri, 16 Jun 2000 11:00:35 -0400 (EDT) -To: Jan Wieck -cc: Hiroshi Inoue , Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006161242.OAA15163@hot.jw.home> -References: <200006161242.OAA15163@hot.jw.home> -Comments: In-reply-to JanWieck@t-online.de (Jan Wieck) - message dated "Fri, 16 Jun 2000 14:42:12 +0200" -Date: Fri, 16 Jun 2000 11:00:35 -0400 -Message-ID: <7181.961167635@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -JanWieck@t-online.de (Jan Wieck) writes: -> There are also disadvantages. - -> You can run out of space even if there are plenty GB's -> free on your disks. You have to create tablespaces -> explicitly. - -Not to mention the reverse: if I read this right, you have to suck -up your GB's long in advance of actually needing them. That's OK -for a machine that's dedicated to Oracle ... not so OK for smaller -installations, playpens, etc. - -I'm not convinced that there's anything fundamentally wrong with -doing storage allocation in Unix files the way we have been. - -(At least not when we're sitting atop a well-done filesystem, -which may leave the Linux folk out in the cold ;-).) 
- - regards, tom lane - -From tgl@sss.pgh.pa.us Fri Jun 16 12:01:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA29853 - for ; Fri, 16 Jun 2000 12:01:02 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA08255 for ; Fri, 16 Jun 2000 11:48:10 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA07461; - Fri, 16 Jun 2000 11:46:41 -0400 (EDT) -To: Jan Wieck -cc: Hiroshi Inoue , Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006161242.OAA15163@hot.jw.home> -References: <200006161242.OAA15163@hot.jw.home> -Comments: In-reply-to JanWieck@t-online.de (Jan Wieck) - message dated "Fri, 16 Jun 2000 14:42:12 +0200" -Date: Fri, 16 Jun 2000 11:46:41 -0400 -Message-ID: <7458.961170401@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -JanWieck@t-online.de (Jan Wieck) writes: -> Tom Lane wrote: ->> It gets a little trickier if you want to be able to split ->> multi-gig tables across several tablespaces, though, since ->> you couldn't just append ".N" to the base table path in that ->> scenario. ->> ->> I'd be interested to know what sort of facilities Oracle ->> provides for managing huge tables... - -> Oracle tablespaces are a collection of 1...n preallocated -> files. Each table then is bound to a tablespace and -> allocates extents (chunks) from those files. - -OK, to get back to the point here: so in Oracle, tables can't cross -tablespace boundaries, but a tablespace itself could span multiple -disks? - -Not sure if I like that better or worse than equating a tablespace -with a directory (so, presumably, all the files within it live on -one filesystem) and then trying to make tables able to span -tablespaces. 
We will need to do one or the other though, if we want -to have any significant improvement over the current state of affairs -for large tables. - -One way is to play the flip-the-path-ordering game some more, -and access multiple-segment tables with pathnames like this: - - .../TABLESPACE/RELATION -- first or only segment - .../TABLESPACE/N/RELATION -- N'th extension segment - -This isn't any harder for md.c to deal with than what we do now, -but by making the /N subdirectories be symlinks, the dbadmin could -easily arrange for extension segments to go on different filesystems. -Also, since /N subdirectory symlinks can be added as needed, -expanding available space by attaching more disks isn't hard. -(If the admin hasn't pre-made a /N symlink when it's needed, -I'd envision the backend just automatically creating a plain -subdirectory so that it can extend the table.) - -A limitation is that the N'th extension segments of all the relations -in a given tablespace have to be in the same place, but I don't see -that as a major objection. Worst case is you make a separate tablespace -for each of your multi-gig relations ... you're probably not going to -have a very large number of such relations, so this doesn't seem like -unmanageable admin complexity. - -We'd still want to create some tools to help the dbadmin with slinging -all these symlinks around, of course. But I think it's critical to keep -the low-level file access protocol simple and reliable, which really -means minimizing the amount of information the backend needs to know to -figure out which file to write a page in. With something like the above -you only need to know the tablespace name (or more likely OID), the -relation OID (+name or not, depending on outcome of other argument), -and the offset in the table. No worse than now from the software's -point of view. - -Comments? 
- - regards, tom lane - -From lockhart@alumni.caltech.edu Fri Jun 16 12:31:50 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA00649 - for ; Fri, 16 Jun 2000 12:31:49 -0400 (EDT) -Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA13118 for ; Fri, 16 Jun 2000 12:31:52 -0400 (EDT) -Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203]) - by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id JAA15007; - Fri, 16 Jun 2000 09:27:18 -0700 (PDT) -Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1]) - by golem.jpl.nasa.gov (Postfix) with ESMTP - id DD8426F51; Fri, 16 Jun 2000 16:27:22 +0000 (UTC) -Sender: lockhart@mythos.jpl.nasa.gov -Message-ID: <394A556A.4EAC8B9A@alumni.caltech.edu> -Date: Fri, 16 Jun 2000 16:27:22 +0000 -From: Thomas Lockhart -Organization: Yes -X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Tom Lane -Cc: Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: RO - -> ... But I think it's critical to keep -> the low-level file access protocol simple and reliable, which really -> means minimizing the amount of information the backend needs to know -> to figure out which file to write a page in. With something like the -> above you only need to know the tablespace name (or more likely OID), -> the relation OID (+name or not, depending on outcome of other -> argument), and the offset in the table. No worse than now from the -> software's point of view. -> Comments? 
- -I'm probably missing the context a bit, but imho we should try hard to -stay away from symlinks as the general solution for anything. - -Sorry for being behind here, but to make sure I'm on the right page: -o tablespaces decouple storage from logical tables -o a database lives in a default tablespace, unless specified -o by default, a table will live in the default tablespace -o (eventually) a table can be split across tablespaces - -Some thoughts: -o the ability to split single tables across disks was essential for -scalability when disks were small. But with RAID, NAS, etc etc isn't -that a smaller issue now? -o "tablespaces" would implement our less-developed "with location" -feature, right? Splitting databases, whole indices and whole tables -across storage is the biggest win for this work since more users will -use the feature. -o location information needs to travel with individual tables anyway. - -From scrappy@hub.org Fri Jun 16 13:01:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01191; - Fri, 16 Jun 2000 13:01:01 -0400 (EDT) -Received: from thelab.hub.org (nat193.152.mpoweredpc.net [142.177.193.152]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA15282; Fri, 16 Jun 2000 12:53:23 -0400 (EDT) -Received: from localhost (scrappy@localhost) - by thelab.hub.org (8.9.3/8.9.3) with ESMTP id NAA28326; - Fri, 16 Jun 2000 13:50:37 -0300 (ADT) - (envelope-from scrappy@hub.org) -X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs -Date: Fri, 16 Jun 2000 13:50:37 -0300 (ADT) -From: The Hermit Hacker -To: Bruce Momjian -cc: Tom Lane , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <200006160224.WAA04345@candle.pha.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=US-ASCII -Status: RO - -On Thu, 15 Jun 2000, Bruce Momjian wrote: - -> > "Hiroshi Inoue" writes: -> > > Now I like neither relname nor oid because it's not sufficient -> > > for my purpose. -> > -> > We should probably not do much of anything with this issue until -> > we have a clearer understanding of what we want to do about -> > tablespaces and schemas. -> -> Here is an analysis of our options: -> -> Work required Disadvantages -> ---------------------------------------------------------------------------- -> -> Keep current system no work rename/create no rollback -> -> relname/oid but less work new pg_class column, -> no rename change filename not accurate on -> rename -> -> relname/oid with more work complex code -> rename change during -> vacuum -> -> oid filename less work, but confusing to admins -> need admin tools - -My vote is with Tom on this one ... oid only ... the admin should be able -to do a quick SELECT on a table to find out the OID->table mapping, and I -believe it's already been pointed out that you can't just restore one file -anyway, so it kinda negates the "server isn't running" problem ... - - - - -From tgl@sss.pgh.pa.us Fri Jun 16 13:01:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01188 - for ; Fri, 16 Jun 2000 13:01:01 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA15530 for ; Fri, 16 Jun 2000 12:55:38 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA07750; - Fri, 16 Jun 2000 12:54:00 -0400 (EDT) -To: Thomas Lockhart -cc: Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J.
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <394A556A.4EAC8B9A@alumni.caltech.edu> -References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> <394A556A.4EAC8B9A@alumni.caltech.edu> -Comments: In-reply-to Thomas Lockhart - message dated "Fri, 16 Jun 2000 16:27:22 -0000" -Date: Fri, 16 Jun 2000 12:54:00 -0400 -Message-ID: <7747.961174440@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Thomas Lockhart writes: ->> ... But I think it's critical to keep ->> the low-level file access protocol simple and reliable, which really ->> means minimizing the amount of information the backend needs to know ->> to figure out which file to write a page in. With something like the ->> above you only need to know the tablespace name (or more likely OID), ->> the relation OID (+name or not, depending on outcome of other ->> argument), and the offset in the table. No worse than now from the ->> software's point of view. ->> Comments? - -> I'm probably missing the context a bit, but imho we should try hard to -> stay away from symlinks as the general solution for anything. - -Why? 
- - regards, tom lane - -From dhogaza@pacifier.com Fri Jun 16 14:55:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02086 - for ; Fri, 16 Jun 2000 14:54:59 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA26430 for ; Fri, 16 Jun 2000 14:40:00 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id LAA08661; - Fri, 16 Jun 2000 11:38:36 -0700 (PDT) -Message-Id: <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Fri, 16 Jun 2000 10:50:23 -0700 -To: Tom Lane , Jan Wieck -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: Hiroshi Inoue , Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -In-Reply-To: <7458.961170401@sss.pgh.pa.us> -References: <200006161242.OAA15163@hot.jw.home> - <200006161242.OAA15163@hot.jw.home> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 11:46 AM 6/16/00 -0400, Tom Lane wrote: - ->OK, to get back to the point here: so in Oracle, tables can't cross ->tablespace boundaries, - -Right, the construct AFAIK is "create table/index foo on tablespace ..." - -> but a tablespace itself could span multiple ->disks? - -Right. - ->Not sure if I like that better or worse than equating a tablespace ->with a directory (so, presumably, all the files within it live on ->one filesystem) and then trying to make tables able to span ->tablespaces. We will need to do one or the other though, if we want ->to have any significant improvement over the current state of affairs ->for large tables. - -Oracle's way does a reasonable job of isolating the datamodel -from the details of the physical layout. - -Take the OpenACS web toolkit, for instance. 
We could take -each module's tables and indices and assign them appropriately -to various dataspaces, then provide a separate .sql files with -only "create tablespace" statements in there. - -By modifying that one central file, the toolkit installation -could be customized to run anything from a small site (one -disk with everything on it, ala my own personal webserver at -birdnotes.net) or a very large site with many spindles, with -various index and table structures spread out widely hither -and thither. - -Given that the OpenACS datamodel is nearly 10K lines long (including -many comments, of course), being able to customize an installation -to such a degree by modifying a single file filled with "create -tablespaces" would be very attractive. - ->One way is to play the flip-the-path-ordering game some more, ->and access multiple-segment tables with pathnames like this: -> -> .../TABLESPACE/RELATION -- first or only segment -> .../TABLESPACE/N/RELATION -- N'th extension segment -> ->This isn't any harder for md.c to deal with than what we do now, ->but by making the /N subdirectories be symlinks, the dbadmin could ->easily arrange for extension segments to go on different filesystems. - -I personally dislike depending on symlinks to move stuff around. -Among other things, a pg_dump/restore (and presumably future -backup tools?) can't recreate the disk layout automatically. - ->We'd still want to create some tools to help the dbadmin with slinging ->all these symlinks around, of course. - -OK, if symlinks are simply an implementation detail hidden from the -dbadmin, and if the physical structure is kept in the db so it can -be rebuilt if necessary automatically, then I don't mind symlinks. - -> But I think it's critical to keep ->the low-level file access protocol simple and reliable, which really ->means minimizing the amount of information the backend needs to know to ->figure out which file to write a page in. 
With something like the above ->you only need to know the tablespace name (or more likely OID), the ->relation OID (+name or not, depending on outcome of other argument), ->and the offset in the table. No worse than now from the software's ->point of view. - -Make the code that creates and otherwise manipulates tablespaces -do the work, while keeping the low-level file access protocol simple. - -Yes, this approach sounds very good to me. - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From pgsql-hackers-owner+M3500@hub.org Fri Jun 16 14:55:10 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02107 - for ; Fri, 16 Jun 2000 14:55:09 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA26943 for ; Fri, 16 Jun 2000 14:44:12 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5GIelM05972; - Fri, 16 Jun 2000 14:40:47 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5GIe5M05692 - for ; Fri, 16 Jun 2000 14:40:05 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id LAA08667; - Fri, 16 Jun 2000 11:38:41 -0700 (PDT) -Message-Id: <3.0.1.32.20000616111435.01a17a10@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Fri, 16 Jun 2000 11:14:35 -0700 -To: Thomas Lockhart , - Tom Lane -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -In-Reply-To: <394A556A.4EAC8B9A@alumni.caltech.edu> -References: <200006161242.OAA15163@hot.jw.home> - <7458.961170401@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -At 04:27 PM 6/16/00 +0000, Thomas Lockhart wrote: - ->Sorry for being behind here, but to make sure I'm on the right page: ->o tablespaces decouple storage from logical tables ->o a database lives in a default tablespace, unless specified ->o by default, a table will live in the default tablespace ->o (eventually) a table can be split across tablespaces - -Or tablespaces across filesystems/mountpoints whatever. - ->Some thoughts: ->o the ability to split single tables across disks was essential for ->scalability when disks were small. But with RAID, NAS, etc etc isn't ->that a smaller issue now? - -Yes for size issues, I should think, especially if you have the -money for a large RAID subsystem. But for throughput performance, -control over which spindles particularly busy tables and indices -go on would still seem to be pretty relevant, when they're being -updated a lot. In order to minimize seek times. - -I really can't say how important this is in reality. Oracle-world -folks still talk about this kind of optimization being important, -but I'm not personally running any kind of database-backed website -that's busy enough or contains enough storage to worry about it. - ->o "tablespaces" would implement our less-developed "with location" ->feature, right? Splitting databases, whole indices and whole tables ->across storage is the biggest win for this work since more users will ->use the feature. ->o location information needs to travel with individual tables anyway. - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. 
- -From tgl@sss.pgh.pa.us Fri Jun 16 15:00:55 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA02397 - for ; Fri, 16 Jun 2000 15:00:54 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id PAA08247; - Fri, 16 Jun 2000 15:00:11 -0400 (EDT) -To: Don Baccus -cc: Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com> -References: <200006161242.OAA15163@hot.jw.home> <200006161242.OAA15163@hot.jw.home> <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com> -Comments: In-reply-to Don Baccus - message dated "Fri, 16 Jun 2000 10:50:23 -0700" -Date: Fri, 16 Jun 2000 15:00:10 -0400 -Message-ID: <8244.961182010@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Don Baccus writes: ->> This isn't any harder for md.c to deal with than what we do now, ->> but by making the /N subdirectories be symlinks, the dbadmin could ->> easily arrange for extension segments to go on different filesystems. - -> I personally dislike depending on symlinks to move stuff around. -> Among other things, a pg_dump/restore (and presumably future -> backup tools?) can't recreate the disk layout automatically. - -Good point, we'd need some way of saving/restoring the tablespace -structures. - ->> We'd still want to create some tools to help the dbadmin with slinging ->> all these symlinks around, of course. - -> OK, if symlinks are simply an implementation detail hidden from the -> dbadmin, and if the physical structure is kept in the db so it can -> be rebuilt if necessary automatically, then I don't mind symlinks. - -I'm not sure about keeping it in the db --- creates a bit of a -chicken-and-egg problem doesn't it? 
Maybe there needs to be a -"system database" that has nailed-down pathnames (no tablespaces -for you baby) and contains the critical installation-wide tables -like pg_database, pg_user, pg_tablespace. A restore would have -to restore these tables first anyway. - -> Make the code that creates and otherwise manipulates tablespaces -> do the work, while keeping the low-level file access protocol simple. - -Right, that's the bottom line for me. - - regards, tom lane - -From reedstrm@rice.edu Fri Jun 16 16:51:50 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA03689 - for ; Fri, 16 Jun 2000 16:51:49 -0400 (EDT) -Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id PAA03409 for ; Fri, 16 Jun 2000 15:48:40 -0400 (EDT) -Received: by rice.edu - via sendmail from stdin - id (Debian Smail3.2.0.102) - for maillist@candle.pha.pa.us; Fri, 16 Jun 2000 14:35:28 -0500 (CDT) -Date: Fri, 16 Jun 2000 14:35:28 -0500 -From: "Ross J. Reedstrom" -To: Thomas Lockhart -Cc: Tom Lane , Jan Wieck , - Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -Message-ID: <20000616143528.A28920@rice.edu> -Mail-Followup-To: Thomas Lockhart , - Tom Lane , Jan Wieck , - Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development -References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> <394A556A.4EAC8B9A@alumni.caltech.edu> -Mime-Version: 1.0 -Content-Type: text/plain; charset=iso-8859-1 -Content-Transfer-Encoding: 8bit -User-Agent: Mutt/1.0i -In-Reply-To: <394A556A.4EAC8B9A@alumni.caltech.edu>; from lockhart@alumni.caltech.edu on Fri, Jun 16, 2000 at 04:27:22PM +0000 -Status: RO - -On Fri, Jun 16, 2000 at 04:27:22PM +0000, Thomas Lockhart wrote: -> > ... 
But I think it's critical to keep -> > the low-level file access protocol simple and reliable, which really -> > means minimizing the amount of information the backend needs to know -> > to figure out which file to write a page in. With something like the -> > above you only need to know the tablespace name (or more likely OID), -> > the relation OID (+name or not, depending on outcome of other -> > argument), and the offset in the table. No worse than now from the -> > software's point of view. -> > Comments? - -I think the backend needs a per table token that indicates how -to get at the physical bits of the file. Whether that's a filename -alone, filename with path, oid, key to a smgr hash table or something -else, it's opaque above the smgr routines. - -Hmm, now I'm thinking, since the tablespace discussion has been reopened, -the way to go about coding all this is to reactivate the smgr code: how -about I leave the existing md smgr as is, and clone it, call it md2 or -something, and start messing with adding features there? - - -> -> I'm probably missing the context a bit, but imho we should try hard to -> stay away from symlinks as the general solution for anything. -> -> Sorry for being behind here, but to make sure I'm on the right page: -> o tablespaces decouple storage from logical tables -> o a database lives in a default tablespace, unless specified -> o by default, a table will live in the default tablespace -> o (eventually) a table can be split across tablespaces -> -> Some thoughts: -> o the ability to split single tables across disks was essential for -> scalability when disks were small. But with RAID, NAS, etc etc isn't -> that a smaller issue now? -> o "tablespaces" would implement our less-developed "with location" -> feature, right? Splitting databases, whole indices and whole tables -> across storage is the biggest win for this work since more users will -> use the feature. 
-> o location information needs to travel with individual tables anyway. - -I was just thinking that that discussion needed some summation. - -Some links to historic discussion: - -This one is Vadim saying WAL will need OID names: -http://www.postgresql.org/mhonarc/pgsql-hackers/1999-11/msg00809.html - -A longer discussion kicked off by Don Baccus: -http://www.postgresql.org/mhonarc/pgsql-hackers/2000-01/msg00510.html - -Tom suggesting OIDs to allow rollback: -http://www.postgresql.org/mhonarc/pgsql-hackers/2000-03/msg00119.html - - -Martin Neumann posted a question on dataspaces: - -(can't find it in the official archives: looks like March 2000, 10-29 is -missing. Here's my copy: don't beat on it! In particular, since I threw -it together for local access, it's one _big_ index page) - -http://cooker.ir.rice.edu/postgresql/msg20257.html -(in that thread is a post where I mention blindwrites and getting rid -of GetRawDatabaseInfo) - -Martin later posted an RFD on tablespaces: - -http://cooker.ir.rice.edu/postgresql/msg20490.html - -Here's Horák Daniel with a patch for discussion, implementing dataspaces -on a per database level: - -http://cooker.ir.rice.edu/postgresql/msg20498.html - -Ross --- -Ross J. Reedstrom, Ph.D., -NSBRI Research Scientist/Programmer -Computer and Information Technology Institute -Rice University, 6100 S.
Main St., Houston, TX 77005 - -From dhogaza@pacifier.com Fri Jun 16 16:51:51 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA03692 - for ; Fri, 16 Jun 2000 16:51:50 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id PAA02911 for ; Fri, 16 Jun 2000 15:43:13 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id MAA11003; - Fri, 16 Jun 2000 12:41:50 -0700 (PDT) -Message-Id: <3.0.1.32.20000616123736.01a19910@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Fri, 16 Jun 2000 12:37:36 -0700 -To: Tom Lane -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -In-Reply-To: <8244.961182010@sss.pgh.pa.us> -References: <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com> - <200006161242.OAA15163@hot.jw.home> - <200006161242.OAA15163@hot.jw.home> - <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 03:00 PM 6/16/00 -0400, Tom Lane wrote: - ->> OK, if symlinks are simply an implementation detail hidden from the ->> dbadmin, and if the physical structure is kept in the db so it can ->> be rebuilt if necessary automatically, then I don't mind symlinks. -> ->I'm not sure about keeping it in the db --- creates a bit of a ->chicken-and-egg problem doesn't it? - -Not if the tablespace creates precede the tables stored in them. - -> Maybe there needs to be a ->"system database" that has nailed-down pathnames (no tablespaces ->for you baby) and contains the critical installation-wide tables ->like pg_database, pg_user, pg_tablespace. A restore would have ->to restore these tables first anyway.
- -Oh, I see. Yes, when I've looked into this and have thought about -it I've assumed that there would always be a known starting point -which would contain the installation-wide tables. - ->From a practical point of view, I don't think that's really a -problem. - -I've not looked into how Oracle does this, I assume it builds -a system tablespace on one of the initial mount points you give -it when you install the thing. The paths to the mount points -are stored in specific files known to Oracle, I think. It's -been over a year (not long enough!) since I've set up Oracle... - - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From pgsql-hackers-owner+M3512@hub.org Fri Jun 16 17:31:04 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04168 - for ; Fri, 16 Jun 2000 17:31:03 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id RAA12122 for ; Fri, 16 Jun 2000 17:09:28 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5GL7WM02231; - Fri, 16 Jun 2000 17:07:32 -0400 (EDT) -Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5GL7EM02150 - for ; Fri, 16 Jun 2000 17:07:14 -0400 (EDT) -Received: by rice.edu - via sendmail from stdin - id (Debian Smail3.2.0.102) - for pgsql-hackers@postgresql.org; Fri, 16 Jun 2000 16:07:13 -0500 (CDT) -Date: Fri, 16 Jun 2000 16:07:13 -0500 -From: "Ross J. 
Reedstrom" -To: Tom Lane -Cc: pgsql-hackers@postgresql.org -Subject: Re: [HACKERS] Big 7.1 open items -Message-ID: <20000616160713.A30793@rice.edu> -Mail-Followup-To: Tom Lane , - pgsql-hackers@postgresql.org -References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> <18798.961053112@sss.pgh.pa.us> <20000615114519.B3939@rice.edu> <2260.961113232@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -User-Agent: Mutt/1.0i -In-Reply-To: <2260.961113232@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Thu, Jun 15, 2000 at 07:53:52PM -0400 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -On Thu, Jun 15, 2000 at 07:53:52PM -0400, Tom Lane wrote: -> "Ross J. Reedstrom" writes: -> > On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote: -> >> "Ross J. Reedstrom" writes: -> >>>> Any strong objections to the mixed relname_oid solution? -> >> -> >> Yes! -> -> > The plan here was to let VACUUM handle renaming the file, since it -> > will already have all the necessary locks. This shortens the window -> > of confusion. ALTER TABLE RENAME doesn't happen that often, really - -> > the relname is there just for human consumption, then. -> -> Yeah, I've seen tons of discussion of how if we do this, that, and -> the other thing, and be prepared to fix up some other things in case -> of crash recovery, we can make it work with filename == relname + OID -> (where relname tracks logical name, at least at some remove). -> -> Probably. Assuming nobody forgets anything. - -I agree, it seems a major undertaking, at first glance. And second. Even -third. Especially for someone who hasn't 'earned his spurs' yet. as -it were. - -> I'm just trying to point out that that's a huge amount of pretty -> delicate mechanism. The amount of work required to make it trustworthy -> looks to me to dwarf the admin tools that Bruce is complaining about. 
-> And we only have a few people competent to do the work. (With all -> due respect, Ross, if you weren't already aware of the implications -> for mdblindwrt, I have to wonder what else you missed.) - -Ah, you knew that comment would come back to haunt me (I have a -tendency to think out loud, even if checking and coming back later -would be better;-) In fact, there's no problem, and never was, since the -buffer->blind.relname is filled in via RelationGetPhysicalRelationName, -just like every other path that requires direct file access. I just -didn't remember that I had in fact checked it (it's been a couple months, -and I just got back from vacation ;-) - -Actually, once I re-checked it, the code looked very familiar. I had -spent time looking at the blind write code in the context of getting -rid of the only non-startup use of GetRawDatabaseInfo. - -As to missing things: I'm leaning heavily on Bruce's previous -work for temp tables, to separate the two uses of relname, via the -RelationGetRelationName and RelationGetPhysicalRelationName. There are -102 uses of the first in the current code (many in elog messages), and -only 11 of the second. If I'd had to do the original work of finding -every use of relname, and categorizing it, I agree I'm not (yet) up to -it, but I have more confidence in Bruce's (already tested) work. - -> -> Filename == OID is so simple, reliable, and straightforward by -> comparison that I think the decision is a no-brainer. -> - -Perhaps. Changing the label of the file on disk still requires finding -all the code that assumes it knows what that name is, and changing it. -Same work. - -> If we could afford to sink unlimited time into this one issue then -> it might make sense to do it the hard way, but we have enough -> important stuff on our TODO list to keep us all busy for years --- -> I cannot believe that it's an effective use of our time to do this. -> - -The joys of Open Development.
You've spent a fair amount of time trying -to convince _me_ not to waste my time. Thanks, but I'm pretty bull headed -sometimes. Since I've already done some of the work, take a look -at what I've got, and then tell me I'm wasting my time, o.k.? - -> -> > Hmm, what's all this with functions in catalog.c that are only called by -> > smgr/md.c? seems to me that anything having to do with physical storage -> > (like the path!) belongs in the smgr abstraction. -> -> Yeah, there's a bunch of stuff that should have been implemented by -> adding new smgr entry points, but wasn't. It should be pushed down. -> (I can't resist pointing out that one of those things is physical -> relation rename, which will go away and not *need* to be pushed down -> if we do it the way I want.) -> - -Oh, I agree completely. In fact, as I said to Hiroshi last time this came -up, I think of the field in pg_class as an opaque token, to be filled in -by the smgr, and only used by code further up to hand back to the smgr -routines. Same should be true of the buffer->blind struct. - -Ross --- -Ross J. Reedstrom, Ph.D., -NSBRI Research Scientist/Programmer -Computer and Information Technology Institute -Rice University, 6100 S. Main St., Houston, TX 77005 - - -From Inoue@tpf.co.jp Fri Jun 16 19:31:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05334 - for ; Fri, 16 Jun 2000 19:30:59 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA19834 for ; Fri, 16 Jun 2000 19:09:59 -0400 (EDT) -Received: from mcadnote1 (ppm122.noc.fukui.nsk.ne.jp [210.161.188.41]) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id IAA08210; Sat, 17 Jun 2000 08:08:15 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" , "Jan Wieck" -Cc: "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J.
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Sat, 17 Jun 2000 08:11:08 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -In-Reply-To: <7181.961167635@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -Importance: Normal -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> JanWieck@t-online.de (Jan Wieck) writes: -> > There are also disadvantages. -> -> > You can run out of space even if there are plenty GB's -> > free on your disks. You have to create tablespaces -> > explicitly. -> -> Not to mention the reverse: if I read this right, you have to suck -> up your GB's long in advance of actually needing them. That's OK -> for a machine that's dedicated to Oracle ... not so OK for smaller -> installations, playpens, etc. -> - -I've had an anxiety about the way like Oracle's preallocation. -It had not been easy for me to estimate the extent size in -Oracle. Maybe it would lose the simplicity of environment -settings which is one of the biggest advantage of PostgreSQL. -It seems that we should also provide not_preallocated DATAFILE -when many_tables_in_a_file storage manager is introduced. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - - -From tgl@sss.pgh.pa.us Fri Jun 16 19:31:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05337 - for ; Fri, 16 Jun 2000 19:31:00 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA20335 for ; Fri, 16 Jun 2000 19:18:26 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA09274; - Fri, 16 Jun 2000 19:16:37 -0400 (EDT) -To: "Ross J. 
Reedstrom" -cc: Thomas Lockhart , - Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <20000616143528.A28920@rice.edu> -References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> <394A556A.4EAC8B9A@alumni.caltech.edu> <20000616143528.A28920@rice.edu> -Comments: In-reply-to "Ross J. Reedstrom" - message dated "Fri, 16 Jun 2000 14:35:28 -0500" -Date: Fri, 16 Jun 2000 19:16:37 -0400 -Message-ID: <9271.961197397@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -"Ross J. Reedstrom" writes: -> I think the backend needs a per table token that indicates how -> to get at the physical bits of the file. Whether that's a filename -> alone, filename with path, oid, key to a smgr hash table or something -> else, it's opaque above the smgr routines. - -Except to the commands that provide the user interface for tablespaces -and so forth. And there aren't all that many places that deal with -physical filenames anyway. It would be a good idea to try to be a -little stricter about this, but I'm not sure you can make the separation -a whole lot cleaner than it is now ... with the exception of the obvious -bogosities like "rename table" being done above the smgr level. (But, -as I said, I want to see that code go away, not just get moved into -smgr...) - -> Hmm, now I'm thinking, since the tablespace discussion has been reopened, -> the way to go about coding all this is to reactivate the smgr code: how -> about I leave the existing md smgr as is, and clone it, call it md2 or -> something, and start messing with adding features there? - -Um, well, you can't have it both ways. If you're going to change/fix -the assumptions of code above the smgr, then you've got to update md -at the same time to match your new definition of the smgr interface. -Won't do much good to have a playpen smgr if the "standard" one is -broken. 
- -One thing I have been thinking would be a good idea is to take the -relcache out of the bufmgr/smgr interfaces. The relcache is a -higher-level concept and ought not be known to bufmgr or smgr; they -ought to work with some low-level data structure or token for relations. -We might be able to eliminate the whole concept of "blind write" if we -do that. There are other problems with the relcache dependency: entries -in relcache can get blown away at inopportune times due to shared cache -inval, and it doesn't provide a good home for tokens for multiple -"versions" of a relation if we go with the fill-a-new-physical-file -approach to CLUSTER and so on. - -Hmm, if you replace relcache in the smgr interfaces with pointers to -an smgr-maintained data structure, that might be the same thing that -you are alluding to above about an smgr hash table. - -One thing *not* to do is add yet a third layer of data structure on -top of the ones already maintained in fd.c and md.c. Whatever extra -data might be needed here should be added to md.c's tables, I think, -and then the tokens used in the smgr interface would be pointers into -that table. - - regards, tom lane - -From tgl@sss.pgh.pa.us Fri Jun 16 19:30:43 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05329 - for ; Fri, 16 Jun 2000 19:30:41 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA09320; - Fri, 16 Jun 2000 19:30:26 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: -References: -Comments: In-reply-to "Hiroshi Inoue" - message dated "Sat, 17 Jun 2000 08:11:08 +0900" -Date: Fri, 16 Jun 2000 19:30:25 -0400 -Message-ID: <9317.961198225@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -"Hiroshi Inoue" writes: -> It seems that we should also provide not_preallocated DATAFILE -> when many_tables_in_a_file storage manager is introduced. - -Several people in this thread have been talking like a -single-physical-file storage manager is in our future, but I can't -recall anyone saying that they were going to do such a thing or even -presenting reasons why it'd be a good idea. - -Seems to me that physical file per relation is considerably better for -our purposes. It's easier to figure out what's going on for admin and -debug work, it means less lock contention among different backends -appending concurrently to different relations, and it gives the OS a -better shot at doing effective read-ahead on sequential scans. - -So why all the enthusiasm for multi-tables-per-file? 
- - regards, tom lane - -From chris@bitmead.com Fri Jun 16 21:01:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07578; - Fri, 16 Jun 2000 21:01:00 -0400 (EDT) -Received: from tech.com.au (IDENT:root@techpt.lnk.telstra.net [139.130.75.122]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA24724; Fri, 16 Jun 2000 20:39:30 -0400 (EDT) -Received: from bitmead.com (IDENT:chris@tardis [203.41.180.243]) - by tech.com.au (8.9.3/8.9.3) with ESMTP id KAA21388; - Sat, 17 Jun 2000 10:39:21 +1000 -Sender: chris@tech.com.au -Message-ID: <394AC8B4.C5B4CCFB@bitmead.com> -Date: Sat, 17 Jun 2000 10:39:16 +1000 -From: Chris Bitmead -X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Bruce Momjian -CC: Tom Lane , Hiroshi Inoue , - Jan Wieck , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006170008.UAA06798@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: RO - - -> > So why all the enthusiasm for multi-tables-per-file? - -It allows you to use raw partitions which stop the OS double buffering -and wasting half of memory, as well as removing the overhead of indirect -blocks in the file system. 
- -From Inoue@tpf.co.jp Sat Jun 17 06:00:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA22177; - Sat, 17 Jun 2000 06:00:59 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id FAA21759; Sat, 17 Jun 2000 05:36:27 -0400 (EDT) -Received: from mcadnote1 (ppm130.noc.fukui.nsk.ne.jp [210.161.188.49]) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id SAA08383; Sat, 17 Jun 2000 18:35:36 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" , "Tom Lane" -Cc: "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Sat, 17 Jun 2000 18:38:29 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="US-ASCII" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -In-Reply-To: <200006170008.UAA06798@candle.pha.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -Importance: Normal -Status: RO - -> -----Original Message----- -> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> > -> > So why all the enthusiasm for multi-tables-per-file? -> -> No idea. I thought Vadim mentioned it, but I am not sure anymore. I -> certainly like our current system. -> - -Oops,I'm not so enthusiastic for multi_tables_per_file smgr. -I believe that Ross and I have taken a practical way that doesn't -break current file_per_table smgr. - -However it seems very natural to take multi_tables_per_file -smgr into account when we consider TABLESPACE concept. -Because TABLESPACE is an encapsulation,it should have -a possibility to handle multi_tables_per_file smgr IMHO. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From tgl@sss.pgh.pa.us Sat Jun 17 12:31:08 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA02794; - Sat, 17 Jun 2000 12:31:07 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA07194; Sat, 17 Jun 2000 12:12:53 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA18824; - Sat, 17 Jun 2000 12:11:18 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Bruce Momjian" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: -References: -Comments: In-reply-to "Hiroshi Inoue" - message dated "Sat, 17 Jun 2000 18:38:29 +0900" -Date: Sat, 17 Jun 2000 12:11:18 -0400 -Message-ID: <18821.961258278@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -"Hiroshi Inoue" writes: -> However it seems very natural to take multi_tables_per_file -> smgr into account when we consider TABLESPACE concept. -> Because TABLESPACE is an encapsulation,it should have -> a possibility to handle multi_tables_per_file smgr IMHO. - -OK, I see: you're just saying that the tablespace stuff should be -designed in such a way that it would work with a non-file-per-table -smgr. Agreed, that'd be a good check of a clean design, and someday -we might need it... 
- - regards, tom lane - -From tgl@sss.pgh.pa.us Sun Jun 18 12:30:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA06514 - for ; Sun, 18 Jun 2000 12:30:58 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA04979 for ; Sun, 18 Jun 2000 12:07:44 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA12163; - Sun, 18 Jun 2000 12:06:29 -0400 (EDT) -To: Bruce Momjian -cc: Jan Wieck , Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006181333.JAA01648@candle.pha.pa.us> -References: <200006181333.JAA01648@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Sun, 18 Jun 2000 09:33:44 -0400" -Date: Sun, 18 Jun 2000 12:06:29 -0400 -Message-ID: <12160.961344389@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> ... We could even get fancy and -> round-robin through all the extents directories, looping around to the -> beginning when we run out of them. That sounds nice. - -That sounds horrible. There's no way to tell which extent directory -extent N goes into except by scanning the location directory to find -out how many extent subdirectories there are (so that you can compute -N modulo number-of-directories). Do you want to pay that price on every -file open? - -Worse, what happens when you add another extent directory? You can't -find your old extents anymore, that's what, because they're not in the -right place (N modulo number-of-directories just changed). Since the -extents are presumably on different volumes, you're talking about -physical file moves to get them where they should be. 
You probably -can't add a new extent without shutting down the entire database while -you reshuffle files --- at the very least you'd need to get exclusive -locks on all the tables in that tablespace. - -Also, you'll get filename conflicts from multiple extents of a single -table appearing in one of the recycled extent dirs. You could work -around it by using the non-modulo'd N as part of the final file name, -but that just adds more complexity and makes the filename-generation -machinery that much more closely tied to this specific way of doing -things. - -The right way to do this is that extent N goes into extents subdirectory -N, period. If there's no such subdirectory, create one on-the-fly as a -plain subdirectory of the location directory. The dbadmin can easily -create secondary extent symlinks *in advance of their being needed*. -Reorganizing later is much more painful since it requires moving -physical files, but I think that'd be true no matter what. At least -we should see to it that adding more space in advance of needing it is -painless. - -It's possible to do it that way (auto-create extent subdir if needed) -without tying the md.c machinery real closely to a specific filename -creation procedure: it's just the same sort of thing as install programs -customarily do. "If you fail to create a file, try creating its -ancestor directory." We'd have to think about whether it'd be a good -idea to allow auto-creation of more than one level of directory; offhand -it seems that needing to make more than one level is probably a sign of -an erroneous path, not need for another extent subdirectory. 
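As a rough illustration of the two points argued above (round-robin placement breaks as soon as a directory is added, while "extent N goes into subdirectory N" plus "create the ancestor directory on failure" stays stable), here is a minimal Python sketch. The function names, the `extentN` directory naming and the file name `16384_foo` are hypothetical and chosen only for illustration; this is not the md.c code under discussion.

```python
import os
import tempfile


def extent_dir_round_robin(extent_no, num_dirs):
    """Round-robin placement: the answer depends on how many dirs exist."""
    return extent_no % num_dirs


def extent_path(location_dir, extent_no, filename):
    """Fixed placement: extent N always lives in subdirectory 'extentN'."""
    return os.path.join(location_dir, "extent%d" % extent_no, filename)


def open_extent_file(location_dir, extent_no, filename):
    """Create an extent file; if that fails, create its parent dir and retry.

    Only ONE directory level is auto-created (os.mkdir, not os.makedirs),
    because needing more than one level probably means the path is wrong.
    If the admin pre-created 'extentN' as a symlink, the first open succeeds.
    """
    path = extent_path(location_dir, extent_no, filename)
    try:
        return open(path, "xb")
    except FileNotFoundError:
        # Raises FileNotFoundError again if the location dir itself is missing.
        os.mkdir(os.path.dirname(path))
        return open(path, "xb")


# Round-robin: going from 3 to 4 directories relocates these extents.
moved = [n for n in range(6)
         if extent_dir_round_robin(n, 3) != extent_dir_round_robin(n, 4)]

with tempfile.TemporaryDirectory() as loc:
    with open_extent_file(loc, 2, "16384_foo") as f:
        f.write(b"")
    created = os.path.isdir(os.path.join(loc, "extent2"))
```

With the fixed mapping, no directory scan is needed to locate an extent, and adding extent subdirectories later never changes where existing extents are expected to be.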
- - regards, tom lane - -From dhogaza@pacifier.com Sun Jun 18 20:01:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA19951 - for ; Sun, 18 Jun 2000 20:00:59 -0400 (EDT) -Received: from smtp.pacifier.com (asteroid.pacifier.com [199.2.117.154]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA24345 for ; Sun, 18 Jun 2000 19:50:06 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id QAA05302; - Sun, 18 Jun 2000 16:49:27 -0700 (PDT) -Message-Id: <3.0.1.32.20000618164342.011d2450@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Sun, 18 Jun 2000 16:43:42 -0700 -To: Bruce Momjian , Tom Lane -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: Jan Wieck , Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -In-Reply-To: <200006182250.SAA13436@candle.pha.pa.us> -References: <12160.961344389@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: ROr - -At 06:50 PM 6/18/00 -0400, Bruce Momjian wrote: ->If we eliminate the round-robin idea, what did people think of the rest ->of the ideas? - -Why invent new syntax when "create tablespace" is something a lot -of folks will recognize? - -And why not use "create table ... using ... "? In other words, -Oracle-compatible for this construct? Sure, Postgres doesn't -have to follow Oraclisms but picking an existing construct means -at least SOME folks can import a datamodel without having to -edit it. - -Does your proposal break the smgr abstraction, i.e. does it -preclude later efforts to (say) implement an (optional) -raw-device storage manager? - - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net.
- -From pgsql-hackers-owner+M3571@hub.org Sun Jun 18 23:28:13 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA23880 - for ; Sun, 18 Jun 2000 23:28:12 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA04627 for ; Sun, 18 Jun 2000 23:24:37 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5J3GQM78526; - Sun, 18 Jun 2000 23:16:26 -0400 (EDT) -Received: from candle.pha.pa.us (pgman@nav-43.dsl.navpoint.com [162.33.245.46]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5J3E3M71538 - for ; Sun, 18 Jun 2000 23:14:03 -0400 (EDT) -Received: (from pgman@localhost) - by candle.pha.pa.us (8.9.0/8.9.0) id XAA23541; - Sun, 18 Jun 2000 23:13:44 -0400 (EDT) -From: Bruce Momjian -Message-Id: <200006190313.XAA23541@candle.pha.pa.us> -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <12160.961344389@sss.pgh.pa.us> "from Tom Lane at Jun 18, 2000 12:06:29 - pm" -To: Tom Lane -Date: Sun, 18 Jun 2000 23:13:44 -0400 (EDT) -CC: Jan Wieck , Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -X-Mailer: ELM [version 2.4ME+ PL77 (25)] -MIME-Version: 1.0 -Content-Transfer-Encoding: 7bit -Content-Type: text/plain; charset=US-ASCII -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -My basic proposal is that we optionally allow symlinks when creating -tablespace directories, and that we interrogate those symlinks during a -dump so administrators can move tablespaces around without having to -modify environment variables or system tables. - -I also suggested creating an extent directory to hold extents, like -extent/2 and extent/3. This will allow administration for smaller sites -to be simpler. 
- --- - Bruce Momjian | http://www.op.net/~candle - pgman@candle.pha.pa.us | (610) 853-3000 - + If your life is a hard drive, | 830 Blythe Avenue - + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 - -From dhogaza@pacifier.com Mon Jun 19 00:31:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA01941 - for ; Mon, 19 Jun 2000 00:31:00 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA06881 for ; Mon, 19 Jun 2000 00:11:39 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id VAA29138; - Sun, 18 Jun 2000 21:11:01 -0700 (PDT) -Message-Id: <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Sun, 18 Jun 2000 21:07:48 -0700 -To: Bruce Momjian , Tom Lane -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: Jan Wieck , Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -In-Reply-To: <200006190313.XAA23541@candle.pha.pa.us> -References: <12160.961344389@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 11:13 PM 6/18/00 -0400, Bruce Momjian wrote: ->My basic proposal is that we optionally allow symlinks when creating ->tablespace directories, and that we interrogate those symlinks during a ->dump so administrators can move tablespaces around without having to ->modify environment variables or system tables. - -If they can move them around from within the db, they'll have no need to -move them around from outside the db. - -I don't quite understand your devotion to using filesystem commands -outside the database to do database administration. 
- - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From pgsql-hackers-owner+M3573@hub.org Mon Jun 19 01:31:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA01981 - for ; Mon, 19 Jun 2000 01:31:01 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09569 for ; Mon, 19 Jun 2000 01:13:53 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5J4T3M86960; - Mon, 19 Jun 2000 00:29:04 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5J4RFM80712 - for ; Mon, 19 Jun 2000 00:27:15 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA09517; - Mon, 19 Jun 2000 00:25:53 -0400 (EDT) -To: Bruce Momjian -cc: Jan Wieck , Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006190313.XAA23541@candle.pha.pa.us> -References: <200006190313.XAA23541@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Sun, 18 Jun 2000 23:13:44 -0400" -Date: Mon, 19 Jun 2000 00:25:52 -0400 -Message-ID: <9514.961388752@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -Bruce Momjian writes: -> I also suggested creating an extent directory to hold extents, like -> extent/2 and extent/3. This will allow administration for smaller sites -> to be simpler. - -I don't see the value in creating an extra level of directory --- seems -that just adds one more Unix directory-lookup cycle to each file open, -without any apparent return. 
What's wrong with extent directory names -like extent2, extent3, etc? - -Obviously the extent dirnames must be chosen so they can't conflict -with table filenames, but that's easily done. For example, if table -files are named like 'OID_xxx' then 'extentN' will never conflict. - - regards, tom lane - -From tgl@sss.pgh.pa.us Mon Jun 19 00:30:58 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA01934 - for ; Mon, 19 Jun 2000 00:30:58 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA07814 for ; Mon, 19 Jun 2000 00:29:36 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA09535; - Mon, 19 Jun 2000 00:28:14 -0400 (EDT) -To: Don Baccus -cc: Bruce Momjian , Jan Wieck , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com> -References: <12160.961344389@sss.pgh.pa.us> <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com> -Comments: In-reply-to Don Baccus - message dated "Sun, 18 Jun 2000 21:07:48 -0700" -Date: Mon, 19 Jun 2000 00:28:14 -0400 -Message-ID: <9532.961388894@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Don Baccus writes: -> If they can move them around from within the db, they'll have no need to -> move them around from outside the db. -> I don't quite understand your devotion to using filesystem commands -> outside the database to do database administration. - -Being *able* to use filesystem commands to see/fix what's going on is a -good thing, particularly from a development/debugging standpoint. But -I agree we want to have within-the-system admin commands to do the same -things. 
- - regards, tom lane - -From dhogaza@pacifier.com Mon Jun 19 00:58:39 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA00799 - for ; Mon, 19 Jun 2000 00:58:38 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA08143 for ; Mon, 19 Jun 2000 00:37:39 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id VAA00259; - Sun, 18 Jun 2000 21:36:25 -0700 (PDT) -Message-Id: <3.0.1.32.20000618213319.011d59c0@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Sun, 18 Jun 2000 21:33:19 -0700 -To: Tom Lane -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: Bruce Momjian , Jan Wieck , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -In-Reply-To: <9532.961388894@sss.pgh.pa.us> -References: <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com> - <12160.961344389@sss.pgh.pa.us> - <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 12:28 AM 6/19/00 -0400, Tom Lane wrote: - ->Being *able* to use filesystem commands to see/fix what's going on is a ->good thing, particularly from a development/debugging standpoint. - -Of course it's a crutch for development, but outside of development -circles few users will know how to use the OS in regard to the -database. - -Assuming PG takes off. Of course, if it remains the realm of the -dedicated hard-core hacker, I'm wrong. - -I have nothing against preserving the ability to use filesystem -commands if there's no significant costs inherent with this approach.
-I'd view the breaking of smgr abstraction as a significant cost (though -I agree with Ross that Bruce's proposal shouldn't require that, I -asked my question to flush Bruce out, if you will, because he's -devoted to a particular outside-the-db management model). - -> But ->I agree we want to have within-the-system admin commands to do the same ->things. - -MUST have, I should think. - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From Inoue@tpf.co.jp Mon Jun 19 12:31:17 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA29988 - for ; Mon, 19 Jun 2000 12:31:16 -0400 (EDT) -Received: from sd.tpf.co.jp (mail.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA21005 for ; Mon, 19 Jun 2000 12:15:22 -0400 (EDT) -Received: from mcadnote1 (ppm127.noc.fukui.nsk.ne.jp [210.161.188.46]) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id BAA09828; Tue, 20 Jun 2000 01:14:19 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" -Cc: "Tom Lane" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "Don Baccus" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Tue, 20 Jun 2000 01:17:14 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="us-ascii" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -In-Reply-To: <200006191330.JAA16908@candle.pha.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -Importance: Normal -Status: ROr - -> -----Original Message----- -> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> -> The fact is that symlink information is already stored in the file -> system.
If we store symlink information in the database too, there -> exists the ability for the two to get out of sync. My point is that I -> think we can _not_ store symlink information in the database, and query -> the file system using lstat when required. -> - -Hmm,this seems pretty confusing to me. -I don't understand the necessity of symlink. -Directory tree,symlink,hard link ... are OS's standard. -But I don't think they are fit for dbms management. - -PostgreSQL is a database system of cource. So -couldn't it handle more flexible structure than OS's -directory tree for itself ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - - -From Inoue@tpf.co.jp Tue Jun 20 02:01:04 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24419 - for ; Tue, 20 Jun 2000 02:00:59 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA26090 for ; Tue, 20 Jun 2000 01:51:00 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id OAA10171; Tue, 20 Jun 2000 14:50:03 +0900 -From: "Hiroshi Inoue" -To: "Bruce Momjian" -Cc: "Tom Lane" , "Jan Wieck" , - "Ross J. 
Reedstrom" , - "Don Baccus" , - "PostgreSQL-development" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Tue, 20 Jun 2000 14:52:17 +0900 -Message-ID: <000001bfda7b$b0dbf160$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <200006191735.NAA03241@candle.pha.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Importance: Normal -Status: ROr - -> -----Original Message----- -> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> -> > > -----Original Message----- -> > > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] -> > > -> > > The fact is that symlink information is already stored in the file -> > > system. If we store symlink information in the database too, there -> > > exists the ability for the two to get out of sync. My point is that I -> > > think we can _not_ store symlink information in the database, -> and query -> > > the file system using lstat when required. -> > > -> > Hmm,this seems pretty confusing to me. -> > I don't understand the necessity of symlink. -> > Directory tree,symlink,hard link ... are OS's standard. -> > But I don't think they are fit for dbms management. -> > -> > PostgreSQL is a database system of cource. So -> > couldn't it handle more flexible structure than OS's -> > directory tree for itself ? -> -> Yes, but is anyone suggesting a solution that does not work with -> symlinks? If not, why not do it that way? -> - -Maybe other solutions have been proposed already because -there have been so many opinions and proposals. - -I've felt TABLE(DATA)SPACE discussion has always been -divergent. IMHO,one of the main cause is that various factors -have been discussed at once. Shouldn't we make step by step -consensus in TABLE(DATA)SPACE discussion ? 
- -IMHO,the first step is to decide the syntax of CREATE TABLE -command not to define TABLE(DATA)SPACE. - -Comments ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - -From tgl@sss.pgh.pa.us Tue Jun 20 10:51:32 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA15181 - for ; Tue, 20 Jun 2000 10:51:31 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id KAA26466 for ; Tue, 20 Jun 2000 10:37:20 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA29689; - Tue, 20 Jun 2000 10:36:04 -0400 (EDT) -To: Bruce Momjian -cc: Hiroshi Inoue , Jan Wieck , - "Ross J. Reedstrom" , - Don Baccus , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006201340.JAA10387@candle.pha.pa.us> -References: <200006201340.JAA10387@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Tue, 20 Jun 2000 09:40:03 -0400" -Date: Tue, 20 Jun 2000 10:36:04 -0400 -Message-ID: <29686.961511764@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Bruce Momjian writes: -> Agreed. Seems we have several issues: - -> filename contents -> tablespace implementation -> tablespace directory layout -> tablespace commands and syntax - -I think we've agreed that the filename must depend on tablespace, -file version, and file segment number in some fashion --- plus -the table name/OID of course. Although there's no real consensus -about exactly how to construct the name, agreeing on the components -is still a positive step. - -A couple of other areas of contention were: - - revising smgr interface to be cleaner - exactly what to store in pg_class - -I don't think there's any quibble about the idea of cleaning up smgr, -but we don't have a complete proposal on the table yet either. 
- -As for the pg_class issue, I still favor storing - (a) OID of tablespace --- not for file access, but so that - associated tablespace-table entry can be looked up - by tablespace management operations - (b) pathname of file as a column of type "name", including - a %d to be replaced by segment # - -I think Peter was holding out for storing purely numeric tablespace OID -and table version in pg_class and having a hardwired mapping to pathname -somewhere in smgr. However, I think that doing it that way gains only -micro-efficiency compared to passing a "name" around, while using the -name approach buys us flexibility that's needed for at least some of -the variants under discussion. Given that the exact filename contents -are still so contentious, I think it'd be a bad idea to pick an -implementation that doesn't allow some leeway as to what the filename -will be. A name also has the advantage that it is a single item that -can be used to identify the table to smgr, which will help in cleaning -up the smgr interface. - -As for tablespace layout/implementation, the only real proposal I've -heard is that there be a subdirectory of the database directory for each -tablespace, and that that have a subdirectory for each segment (extent) -of its tables --- where any of these subdirectories could be symlinks -off to a different filesystem. Some unhappiness was raised about -depending on symlinks for this function, but I didn't hear one single -concrete reason not to do it, nor an alternative design. Unless someone -comes up with a counterproposal, I think that that's what the actual -access mechanism will look like. We still need to talk about what we -want to store in the SQL-level representation of a tablespace, and what -sort of tablespace management tools/commands are needed. (Although -"try to make it look like Oracle" seems to be pretty much the consensus -for the command level, not all of us know exactly what that means...) - -Comments? 
Anything else that we do have consensus on? - - regards, tom lane - -From pgsql-hackers-owner+M3615@hub.org Tue Jun 20 12:55:05 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA25768 - for ; Tue, 20 Jun 2000 12:55:04 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA09949 for ; Tue, 20 Jun 2000 12:41:15 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5KGcCM19112; - Tue, 20 Jun 2000 12:38:12 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5KGbbM18701 - for ; Tue, 20 Jun 2000 12:37:37 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:43625 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Tue, 20 Jun 2000 18:37:05 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 134R7f-0003wS-00; Tue, 20 Jun 2000 18:43:35 +0200 -Date: Tue, 20 Jun 2000 18:43:35 +0200 (CEST) -From: Peter Eisentraut -To: Bruce Momjian -cc: Jan Wieck , Tom Lane , - Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <200006180316.XAA15410@candle.pha.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -Bruce Momjian writes: - -> If we have a new CREATE DATABASE LOCATION command, we can say: -> -> CREATE DATABASE LOCATION dbloc IN '/var/private/pgsql'; -> CREATE DATABASE newdb IN dbloc; - -We kind of have this already, with CREATE DATABASE foo WITH LOCATION = -'bar'; but of course with environment variable kludgery. But it's a start. 
- -> mkdir /var/private/pgsql/dbloc -> ln -s /var/private/pgsql/dbloc data/base/dbloc - -I think the problem with this was that you'd have to do an extra lookup -into, say, pg_location to resolve this. Some people are talking about -blind writes, this is not really blind. - -> CREATE LOCATION tabloc IN '/var/private/pgsql'; -> CREATE TABLE newtab ... IN tabloc; - -Okay, so we'd have "table spaces" and "database spaces". Seems like one -"space" ought to be enough. I was thinking that the database "space" would -serve as a default "space" for tables created within it but you could -still create tables in other "spaces" than were the database really is. In -fact, the database wouldn't show up at all in the file names anymore, -which may or may not be a good thing. - -I think Tom suggested something more or less like this: - -$PGDATA/base/tablespace/segment/table - -(leaving the details of "table" aside for now). pg_class would get a -column storing the table space somehow, say an oid reference to -pg_location. There would have to be a default tablespace that's created by -initdb and it's indicated by oid 0. So if you create a simple little table -"foo" it ends up in - -$PGDATA/base/0/0/foo - -That is pretty manageable. Now to create a table space you do - -CREATE LOCATION "name" AT '/some/where'; - -which would make an entry in pg_location and, similar to how you -suggested, create a symlink from - -$PGDATA/base/newoid -> /some/where - -Then when you create a new table at that new location this gets simply -noted in pg_class with an oid reference, the rest works completely -transparently and no lookup outside of pg_class required. The system would -create the segment 0 subdirectory automatically. - -When tables get segmented the system would simply create subdirectories 1, -2, 3, etc. as needed, just as it created the 0 as need, no extra code. - -pg_dump doesn't need to use lstat or whatever at all because the locations -are catalogued. 
Administrators don't even need to know about the linking -business, they just make sure the target directory exists. - -Two more items to ponder: - -* per-location transaction logs - -* pg_upgrade - - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From Inoue@tpf.co.jp Tue Jun 20 17:10:56 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10307 - for ; Tue, 20 Jun 2000 17:10:55 -0400 (EDT) -Received: from sd.tpf.co.jp (mail.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id QAA08017 for ; Tue, 20 Jun 2000 16:57:44 -0400 (EDT) -Received: from mcadnote1 (ppm127.noc.fukui.nsk.ne.jp [210.161.188.46]) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id FAA00867; Wed, 21 Jun 2000 05:56:44 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" , "Bruce Momjian" -Cc: "Jan Wieck" , "Ross J. Reedstrom" , - "Don Baccus" , - "PostgreSQL-development" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Wed, 21 Jun 2000 05:59:41 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -In-Reply-To: <29686.961511764@sss.pgh.pa.us> -Importance: Normal -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> Bruce Momjian writes: -> > Agreed. Seems we have several issues: -> -> > filename contents -> > tablespace implementation -> > tablespace directory layout -> > tablespace commands and syntax -> - -[snip] - -> -> Comments? Anything else that we do have consensus on? -> - -Before the details of tablespace implementation, - -1) How to change(extend) the syntax of CREATE TABLE - We only add table(data)space name with some - keyword ? 
i.e Do we consider tablespace as an - abstraction ? - -To confirm our mutual understanding. - -2) Is tablespace defined per PostgreSQL's database ? -3) Is default tablespace defined per database/user or - for all ? - -AFAIK in Oracle,2) global, 3) per user. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From Inoue@tpf.co.jp Tue Jun 20 20:00:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA12668; - Tue, 20 Jun 2000 20:00:58 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA21016; Tue, 20 Jun 2000 19:54:18 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id IAA00974; Wed, 21 Jun 2000 08:52:38 +0900 -From: "Hiroshi Inoue" -To: "Peter Eisentraut" -Cc: "Jan Wieck" , "Tom Lane" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "Bruce Momjian" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Wed, 21 Jun 2000 08:54:51 +0900 -Message-ID: <000e01bfdb12$ecc08f00$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -In-Reply-To: -Importance: Normal -Status: ROr - -> -----Original Message----- -> From: Peter Eisentraut -> -> Bruce Momjian writes: -> -> > If we have a new CREATE DATABASE LOCATION command, we can say: -> > -> > CREATE DATABASE LOCATION dbloc IN '/var/private/pgsql'; -> > CREATE DATABASE newdb IN dbloc; -> -> We kind of have this already, with CREATE DATABASE foo WITH LOCATION = -> 'bar'; but of course with environment variable kludgery. But it's a start. 
-> -> > mkdir /var/private/pgsql/dbloc -> > ln -s /var/private/pgsql/dbloc data/base/dbloc -> -> I think the problem with this was that you'd have to do an extra lookup -> into, say, pg_location to resolve this. Some people are talking about -> blind writes, this is not really blind. -> -> > CREATE LOCATION tabloc IN '/var/private/pgsql'; -> > CREATE TABLE newtab ... IN tabloc; -> -> Okay, so we'd have "table spaces" and "database spaces". Seems like one -> "space" ought to be enough. - -Does your "database space" correspond to current PostgreSQL's database ? -And is it different from SCHEMA ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - -From tgl@sss.pgh.pa.us Wed Jun 21 00:23:48 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA18016; - Wed, 21 Jun 2000 00:23:47 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA05207; Wed, 21 Jun 2000 00:07:58 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA03002; - Wed, 21 Jun 2000 00:06:42 -0400 (EDT) -To: Bruce Momjian -cc: Hiroshi Inoue , Peter Eisentraut , - Jan Wieck , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006210345.XAA15107@candle.pha.pa.us> -References: <200006210345.XAA15107@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Tue, 20 Jun 2000 23:45:13 -0400" -Date: Wed, 21 Jun 2000 00:06:42 -0400 -Message-ID: <2999.961560402@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> I recommend making a dbname in each directory, then putting the -> location inside there. - -This still seems backwards to me. Why is it better than tablespace -directory inside database directory? 
- -One significant problem with it is that there's no longer (AFAICS) -a "default" per-database directory that corresponds to the current -working directory of backends running in that database. Thus, -for example, it's not immediately clear where temporary files and -backend core-dump files will end up. Also, you've just added an -essential extra level (if not two) to the pathnames that backends will -use to address files. - -There is a great deal to be said for - ..../database/tablespace/filename -where .../database/ is the working directory of a backend running in -that database, so that the relative pathname used by that backend to -get to a table is just tablespace/filename. I fail to see any advantage -in reversing the pathname order. If you see one, enlighten me. - - regards, tom lane - -From pgsql-hackers-owner+M3635@hub.org Wed Jun 21 01:00:59 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA19614 - for ; Wed, 21 Jun 2000 01:00:54 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5L4wA125142; - Wed, 21 Jun 2000 00:58:10 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5L4vp125043 - for ; Wed, 21 Jun 2000 00:57:51 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id NAA01462; Wed, 21 Jun 2000 13:52:47 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" , "Bruce Momjian" -Cc: "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Wed, 21 Jun 2000 13:55:01 +0900 -Message-ID: <000001bfdb3c$db728760$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-reply-to: <2999.961560402@sss.pgh.pa.us> -Importance: Normal -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> Bruce Momjian writes: -> > I recommend making a dbname in each directory, then putting the -> > location inside there. -> -> This still seems backwards to me. Why is it better than tablespace -> directory inside database directory? -> -> One significant problem with it is that there's no longer (AFAICS) -> a "default" per-database directory that corresponds to the current -> working directory of backends running in that database. Thus, -> for example, it's not immediately clear where temporary files and -> backend core-dump files will end up. Also, you've just added an -> essential extra level (if not two) to the pathnames that backends will -> use to address files. -> -> There is a great deal to be said for -> ..../database/tablespace/filename - -OK,I seem to have gotten the answer for the question - Is tablespace defined per PostgreSQL's database ? - -You and Bruce - 1) tablespace is per database -Peter seems to have the following idea(?? not sure) - 2) database = tablespace -My opinion - 3) database and tablespace are relatively irrelevant. - I assume PostgreSQL's database would correspond - to the concept of SCHEMA. - -It seems we are different from the first. -Shoudln't we reach an agreement on it in the first place ? - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - - -From pgsql-hackers-owner+M3636@hub.org Wed Jun 21 01:31:12 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20523 - for ; Wed, 21 Jun 2000 01:31:12 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA08982 for ; Wed, 21 Jun 2000 01:15:17 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5L5Bp151546; - Wed, 21 Jun 2000 01:11:51 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5L5BP151324 - for ; Wed, 21 Jun 2000 01:11:25 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA03463; - Wed, 21 Jun 2000 01:09:52 -0400 (EDT) -To: Chris Bitmead -cc: Bruce Momjian , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <3950484D.417C87E9@nimrod.itg.telecom.com.au> -References: <200006210346.XAA15138@candle.pha.pa.us> <3950484D.417C87E9@nimrod.itg.telecom.com.au> -Comments: In-reply-to Chris Bitmead - message dated "Wed, 21 Jun 2000 14:45:01 +1000" -Date: Wed, 21 Jun 2000 01:09:52 -0400 -Message-ID: <3459.961564192@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Chris Bitmead writes: -> What I meant is, would you still be able to create tablespaces on -> systems without symlinks? That would seem to be a desirable feature. - -All else being equal, it'd be nice. Since all else is not equal, -exactly how much sweat are we willing to expend on supporting that -feature on such systems --- to the exclusion of other features we -might expend the same sweat on, with more widely useful results? 
- -Bear in mind that everything will still *work* just fine on such a -platform, you just don't have a way to spread the database across -multiple filesystems. That's only an issue if the platform has a -fairly Unixy notion of filesystems ... but no symlinks. - -A few messages back someone was opining that we were wasting our time -thinking about tablespaces at all, because any modern platform can -create disk-spanning filesystems for itself, so applications don't have -to worry. I don't buy that argument in general, but I'm quite willing -to quote it for the *very* few systems that are Unixy enough to run -Postgres in the first place, but not quite Unixy enough to have -symlinks. - -You gotta draw the line somewhere at what you will support, and -this particular line seems to me to be entirely reasonable and -justifiable. YMMV... - - regards, tom lane - -From dhogaza@pacifier.com Wed Jun 21 01:31:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20492 - for ; Wed, 21 Jun 2000 01:30:58 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09401 for ; Wed, 21 Jun 2000 01:22:50 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA22395; - Tue, 20 Jun 2000 22:21:47 -0700 (PDT) -Message-Id: <3.0.1.32.20000620221248.0150f610@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Tue, 20 Jun 2000 22:12:48 -0700 -To: "Philip J. Warner" , "Hiroshi Inoue" , - "Tom Lane" , - "Bruce Momjian" -From: Don Baccus -Subject: RE: [HACKERS] Big 7.1 open items -Cc: "Jan Wieck" , "Ross J. 
Reedstrom" , - "PostgreSQL-development" -In-Reply-To: <3.0.5.32.20000621112210.01d97680@mail.rhyme.com.au> -References: - <29686.961511764@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 11:22 AM 6/21/00 +1000, Philip J. Warner wrote: - ->It may be worth considering leaving the CREATE TABLE statement alone. ->Dec/RDB uses a new statement entirely to define where a table goes... - -It's worth considering, but on the other hand Oracle users greatly -outnumber Compaq/RDB users these days... - -If there's no SQL92 guidance for implementing a feature, I'm pretty much in -favor of tracking Oracle, whose SQL dialect is rapidly becoming a -de-facto standard. - -I'm not saying I like the fact, Oracle's a pain in the ass. But when -adopting existing syntax, might as well adopt that of the crushing -borg. - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. 
- -From lockhart@alumni.caltech.edu Wed Jun 21 01:31:07 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20508; - Wed, 21 Jun 2000 01:31:06 -0400 (EDT) -Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09355; Wed, 21 Jun 2000 01:22:03 -0400 (EDT) -Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203]) - by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id WAA00821; - Tue, 20 Jun 2000 22:18:38 -0700 (PDT) -Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1]) - by golem.jpl.nasa.gov (Postfix) with ESMTP - id AF4376F51; Wed, 21 Jun 2000 05:19:29 +0000 (UTC) -Sender: lockhart@mythos.jpl.nasa.gov -Message-ID: <39505061.F42334AB@alumni.caltech.edu> -Date: Wed, 21 Jun 2000 05:19:29 +0000 -From: Thomas Lockhart -Organization: Yes -X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Bruce Momjian -Cc: Peter Eisentraut , Jan Wieck , - Tom Lane , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006201753.NAA27293@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: ROr - -> Yes, I didn't like the environment variable stuff. In fact, I would -> like to not mention the symlink location anywhere in the database, so -> it can be changed without changing it in the database. - -Well, as y'all have noticed, I think there are strong reasons to use -environment variables to manage locations, and that symlinks are a -potential portability and robustness problem. - -An additional point which has relevance to this whole discussion: - -In the future we may allow system resource such as tables to carry names -which use multi-byte encodings. 
afaik these encodings are not allowed to -be used for physical file names, and even if they were the utility of -using standard operating system utilities like ls goes way down. - -istm that from a portability and evolutionary standpoint OID-only file -names (or at least file names *not* based on relation/class names) is a -requirement. - -Comments? - - - Thomas - -From tgl@sss.pgh.pa.us Wed Jun 21 01:31:05 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20503 - for ; Wed, 21 Jun 2000 01:31:05 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09513 for ; Wed, 21 Jun 2000 01:25:18 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA03557; - Wed, 21 Jun 2000 01:23:58 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <000001bfdb3c$db728760$2801007e@tpf.co.jp> -References: <000001bfdb3c$db728760$2801007e@tpf.co.jp> -Comments: In-reply-to "Hiroshi Inoue" - message dated "Wed, 21 Jun 2000 13:55:01 +0900" -Date: Wed, 21 Jun 2000 01:23:57 -0400 -Message-ID: <3554.961565037@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -"Hiroshi Inoue" writes: ->> There is a great deal to be said for ->> ..../database/tablespace/filename - -> OK,I seem to have gotten the answer for the question -> Is tablespace defined per PostgreSQL's database ? - -Not necessarily --- the tablespace subdirectories could be symlinks -pointing to the same place (assuming you use OIDs or something to keep -the table filenames unique even across databases). This is just an -implementation mechanism; it doesn't foreclose the policy decision -whether tablespaces are database-local or installation-wide. 
- -(OTOH, pathnames like tablespace/database would pretty much force -tablespaces to be installation-wide whether you wanted it that way -or not.) - -> My opinion -> 3) database and tablespace are relatively irrelevant. -> I assume PostgreSQL's database would correspond -> to the concept of SCHEMA. - -My inclindation is that tablespaces should be installation-wide, but -I'm not completely sold on it. In any case I could see wanting a -permissions mechanism that would only allow some databases to have -tables in a particular tablespace. - -We do need to think more about how traditional Postgres databases -fit together with SCHEMA. Maybe we wouldn't even need multiple -databases per installation if we had SCHEMA done right. - - regards, tom lane - -From pgsql-hackers-owner+M3641@hub.org Wed Jun 21 02:31:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA25698 - for ; Wed, 21 Jun 2000 02:31:00 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id CAA11423 for ; Wed, 21 Jun 2000 02:09:13 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5L5we151226; - Wed, 21 Jun 2000 01:58:40 -0400 (EDT) -Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5L5wE151030 - for ; Wed, 21 Jun 2000 01:58:14 -0400 (EDT) -Received: by rice.edu - via sendmail from stdin - id (Debian Smail3.2.0.102) - for pgsql-hackers@postgresql.org; Wed, 21 Jun 2000 00:45:02 -0500 (CDT) -Date: Wed, 21 Jun 2000 00:45:02 -0500 -From: "Ross J. 
Reedstrom" -To: Tom Lane -Cc: Hiroshi Inoue , Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -Message-ID: <20000621004502.A24387@rice.edu> -Mail-Followup-To: Tom Lane , - Hiroshi Inoue , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development -References: <000001bfdb3c$db728760$2801007e@tpf.co.jp> <3554.961565037@sss.pgh.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -User-Agent: Mutt/1.0i -In-Reply-To: <3554.961565037@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Wed, Jun 21, 2000 at 01:23:57AM -0400 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -On Wed, Jun 21, 2000 at 01:23:57AM -0400, Tom Lane wrote: -> "Hiroshi Inoue" writes: -> -> > My opinion -> > 3) database and tablespace are relatively irrelevant. -> > I assume PostgreSQL's database would correspond -> > to the concept of SCHEMA. -> -> My inclindation is that tablespaces should be installation-wide, but -> I'm not completely sold on it. In any case I could see wanting a -> permissions mechanism that would only allow some databases to have -> tables in a particular tablespace. -> -> We do need to think more about how traditional Postgres databases -> fit together with SCHEMA. Maybe we wouldn't even need multiple -> databases per installation if we had SCHEMA done right. -> - -The important point I think is that tablespaces are about physical -storage/namespace, and SCHEMA are about logical namespace: it would make -sense for tables from multiple schema to live in the same tablespace, -as well as tables from one schema to be stored in multiple tablespaces. - -Ross --- -Ross J. Reedstrom, Ph.D., -NSBRI Research Scientist/Programmer -Computer and Information Technology Institute -Rice University, 6100 S. 
Main St., Houston, TX 77005 - -From pgsql-hackers-owner+M3644@hub.org Wed Jun 21 02:31:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA25704 - for ; Wed, 21 Jun 2000 02:31:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id CAA11923 for ; Wed, 21 Jun 2000 02:22:41 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5L6JO196109; - Wed, 21 Jun 2000 02:19:24 -0400 (EDT) -Received: from mailo.vtcif.telstra.com.au (mailo.vtcif.telstra.com.au [202.12.144.17]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5L6JB196028 - for ; Wed, 21 Jun 2000 02:19:11 -0400 (EDT) -Received: (from uucp@localhost) by mailo.vtcif.telstra.com.au (8.8.2/8.6.9) id QAA21128 for ; Wed, 21 Jun 2000 16:19:04 +1000 (EST) -Received: from maili.vtcif.telstra.com.au(202.12.142.17) - via SMTP by mailo.vtcif.telstra.com.au, id smtpd08EKgu; Wed Jun 21 16:17:56 2000 -Received: (from uucp@localhost) by maili.vtcif.telstra.com.au (8.8.2/8.6.9) id QAA02825 for ; Wed, 21 Jun 2000 16:17:55 +1000 (EST) -Received: from localhost(127.0.0.1), claiming to be "mail.cdn.telstra.com.au" - via SMTP by localhost, id smtpdnjRBD_; Wed Jun 21 16:17:14 2000 -Received: from lunitari.nimrod.itg.telecom.com.au (lunitari.nimrod.itg.telecom.com.au [192.53.254.48]) by mail.cdn.telstra.com.au (8.8.2/8.6.9) with ESMTP id QAA07553 for ; Wed, 21 Jun 2000 16:17:14 +1000 (EST) -Received: from nimrod.itg.telecom.com.au (majere [192.53.254.45]) - by lunitari.nimrod.itg.telecom.com.au (8.9.1/8.9.3) with ESMTP id QAA05880 - for ; Wed, 21 Jun 2000 16:15:56 +1000 (EST) -Message-ID: <39505D1B.DA335CD2@nimrod.itg.telecom.com.au> -Date: Wed, 21 Jun 2000 16:13:47 +1000 -From: Chris Bitmead -Organization: IBM Global Services -X-Mailer: Mozilla 4.6 [en] (X11; I; SunOS 5.6 sun4u) -X-Accept-Language: en -MIME-Version: 1.0 -To: 
PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -References: <000001bfdb3c$db728760$2801007e@tpf.co.jp> <3554.961565037@sss.pgh.pa.us> <20000621004502.A24387@rice.edu> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Ross J. Reedstrom" wrote: - -> The important point I think is that tablespaces are about physical -> storage/namespace, and SCHEMA are about logical namespace: it would make -> sense for tables from multiple schema to live in the same tablespace, -> as well as tables from one schema to be stored in multiple tablespaces. - -If we accept that argument (which sounds good) then wouldn't we have... - -data/base/db1/table1 -> ../../tablespace/ts1/db1.table1 -data/base/db1/table2 -> ../../tablespace/ts1/db1.table2 -data/tablespace/ts1/db1.table1 -data/tablespace/ts1/db1.table2 - -In other words there is a directory for databases, and a directory for -tablespaces. Database tables are symlinked to the appropriate -tablespace. So there are multiple databases per tablespace and multiple -tablespaces per database.
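The directory layout proposed in this message can be tried out with a small script. This is only a sketch of the idea as described in the email, not anything PostgreSQL implements: the helper name build_layout and the use of a throwaway temporary directory are assumptions made for illustration. The relative link target is computed from the link's own directory rather than hard-coded; for a link in data/base/db1 pointing into data/tablespace/ts1 it comes out as ../../tablespace/ts1/<db>.<table>.

```python
import os
import tempfile


def build_layout(root, db, tablespace, tables):
    """Create data/base/<db>/<table> symlinks that point into
    data/tablespace/<tablespace>/ (hypothetical helper, for illustration)."""
    db_dir = os.path.join(root, "data", "base", db)
    ts_dir = os.path.join(root, "data", "tablespace", tablespace)
    os.makedirs(db_dir, exist_ok=True)
    os.makedirs(ts_dir, exist_ok=True)

    links = {}
    for table in tables:
        # The real file lives in the tablespace and is named <db>.<table>,
        # so several databases can share one tablespace without colliding.
        target = os.path.join(ts_dir, f"{db}.{table}")
        with open(target, "wb"):
            pass  # create an empty placeholder file

        # The database directory only holds a relative symlink to that file.
        link = os.path.join(db_dir, table)
        rel_target = os.path.relpath(target, start=db_dir)
        os.symlink(rel_target, link)
        links[table] = rel_target
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        layout = build_layout(root, "db1", "ts1", ["table1", "table2"])
        for table, rel_target in layout.items():
            print(f"data/base/db1/{table} -> {rel_target}")
```

Calling build_layout again with a second database name and the same tablespace, or the same database and a second tablespace, gives the many-to-many arrangement the message describes.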
- -From pgsql-hackers-owner+M3648@hub.org Wed Jun 21 09:01:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA06055 - for ; Wed, 21 Jun 2000 09:01:00 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA29647 for ; Wed, 21 Jun 2000 08:52:25 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5LCo0112103; - Wed, 21 Jun 2000 08:50:00 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5LCnS112011 - for ; Wed, 21 Jun 2000 08:49:28 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id OAA27330; - Wed, 21 Jun 2000 14:48:44 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Wed, 21 Jun 2000 14:48:44 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA5983@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Hiroshi Inoue'" -Cc: "'pgsql-hackers@postgresql.org'" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Wed, 21 Jun 2000 14:48:43 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - - -> > > CREATE LOCATION tabloc IN '/var/private/pgsql'; -> > > CREATE TABLE newtab ... IN tabloc; -> > -> > Okay, so we'd have "table spaces" and "database spaces". -> Seems like one -> > "space" ought to be enough. - -Yes, one space should be enough. - -> -> Does your "database space" correspond to current PostgreSQL's -> database ? - -I think we should think of the "database space" as the default "table space" -for this database. 
- -> And is it different from SCHEMA ? - -Please don't mix schema and database, they are two different issues. -Even Oracle has a database, only in Oracle you are limited to one database -per instance. We do not want to add this limitation to PostgreSQL. - -Andreas - -From e99re41@DoCS.UU.SE Wed Jun 21 10:01:10 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA06585; - Wed, 21 Jun 2000 10:01:09 -0400 (EDT) -Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA03592; Wed, 21 Jun 2000 09:38:34 -0400 (EDT) -Received: from Ulv.DoCS.UU.SE (e99re41@Ulv.DoCS.UU.SE [130.238.9.167]) - by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id PAA20520; - Wed, 21 Jun 2000 15:34:34 +0200 (MET DST) -Received: from localhost (e99re41@localhost) by Ulv.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id PAA10847; Wed, 21 Jun 2000 15:34:27 +0200 -X-Authentication-Warning: Ulv.DoCS.UU.SE: e99re41 owned process doing -bs -Date: Wed, 21 Jun 2000 15:34:27 +0200 (MET DST) -From: Peter Eisentraut -Reply-To: Peter Eisentraut -To: Hiroshi Inoue -cc: Tom Lane , Bruce Momjian , - Jan Wieck , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -In-Reply-To: <000001bfdb3c$db728760$2801007e@tpf.co.jp> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=iso-8859-1 -Content-Transfer-Encoding: 8bit -X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id KAA06585 -Status: RO - -On Wed, 21 Jun 2000, Hiroshi Inoue wrote: - -> Peter seems to have the following idea(?? not sure) -> 2) database = tablespace - -No, I thought that a database would have a table space assigned that would -serve as the default for newly created tables, but could be overridden. 
So -you could group databases onto disks as you want, but a couple of -particularly big/important/unimportant/etc tables from each database could -be put on a different disk. At least this seems to be the most flexible -and conceptually simple solution. - -Ideally, directories per database would go away, but then we'd have the -system tables colliding, since those have the same oid in each database. -But that's not really important. So essentially you'd have - - $PGDATA/base/tablespacesomething/database/tables - -In the default tablespace, "tablespacesomething" is an ordinary directory, -for other tablespaces it symlinks somewhere else. For those browsing -$PGDATA/base, it all looks the same (unless you have colour ls). For those -browsing the actual storage location it looks like -/var/foo/elsewhere/database/tables. - -I'm sure you can squeeze the extension segments in there, maybe between -tablespace and database. - -What I think Bruce is saying is that there should be both database spaces -and table spaces, I think that's too much. - -> My opinion -> 3) database and tablespace are relatively irrelevant. -> I assume PostgreSQL's database would correspond -> to the concept of SCHEMA. - -A database corresponds to a catalog and a schema corresponds to nothing -yet. 
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From e99re41@DoCS.UU.SE Wed Jun 21 10:01:09 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA06582; - Wed, 21 Jun 2000 10:01:08 -0400 (EDT) -Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA04510; Wed, 21 Jun 2000 09:43:48 -0400 (EDT) -Received: from Ulv.DoCS.UU.SE (e99re41@Ulv.DoCS.UU.SE [130.238.9.167]) - by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id PAA20730; - Wed, 21 Jun 2000 15:39:23 +0200 (MET DST) -Received: from localhost (e99re41@localhost) by Ulv.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id PAA10853; Wed, 21 Jun 2000 15:39:16 +0200 -X-Authentication-Warning: Ulv.DoCS.UU.SE: e99re41 owned process doing -bs -Date: Wed, 21 Jun 2000 15:39:16 +0200 (MET DST) -From: Peter Eisentraut -Reply-To: Peter Eisentraut -To: Bruce Momjian -cc: Jan Wieck , Tom Lane , - Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <200006201753.NAA27293@candle.pha.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=iso-8859-1 -Content-Transfer-Encoding: 8bit -X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id KAA06582 -Status: ROr - -On Tue, 20 Jun 2000, Bruce Momjian wrote: - -> What I was suggesting is not to catalog the symlink locations, but to -> use lstat when dumping, so that admins can move files around using -> symlinks and not have to update the database. - -That surely wouldn't make those happy that are calling for smgr -abstraction.
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From tgl@sss.pgh.pa.us Wed Jun 21 11:31:09 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08120; - Wed, 21 Jun 2000 11:31:08 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA13232; Wed, 21 Jun 2000 11:08:38 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA04286; - Wed, 21 Jun 2000 11:07:20 -0400 (EDT) -To: Bruce Momjian -cc: Hiroshi Inoue , Peter Eisentraut , - Jan Wieck , - Bruce Momjian , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006210433.AAA18343@candle.pha.pa.us> -References: <200006210433.AAA18343@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 21 Jun 2000 00:33:01 -0400" -Date: Wed, 21 Jun 2000 11:07:20 -0400 -Message-ID: <4283.961600040@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> Yes, agreed. I was thinking this: -> CREATE TABLESPACE loc USING '/var/pgsql' -> does: -> ln -s /var/pgsql/dbname/loc data/base/dbname/loc -> In this way, the database has a view of its main directory, plus a /loc -> subdirectory for the tablespace. In the other location, we have -> /var/pgsql/dbname/loc because this allows different databases to use: -> CREATE TABLESPACE loc USING '/var/pgsql' -> and they do not collide with each other in /var/pgsql. - -But they don't collide anyway, because the dbname is already unique. -Isn't the extra subdirectory a waste? 
- -Because table files will have installation-wide unique names, there's -no really good reason to have either level of subdirectory; you could -just make - CREATE TABLESPACE loc USING '/var/pgsql' -do - ln -s /var/pgsql data/base/dbname/loc -and it'd still work even if multiple DBs were using the same tablespace. - -However, forcing creation of a subdirectory does give you the chance to -make sure the subdir is owned by postgres and has the right permissions, -so there's something to be said for that. It might be reasonable to do - mkdir /var/pgsql/dbname - chmod 700 /var/pgsql/dbname - ln -s /var/pgsql/dbname data/base/dbname/loc - - regards, tom lane - -From lockhart@alumni.caltech.edu Wed Jun 21 11:31:10 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08135; - Wed, 21 Jun 2000 11:31:09 -0400 (EDT) -Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA15864; Wed, 21 Jun 2000 11:30:06 -0400 (EDT) -Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203]) - by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id IAA02881; - Wed, 21 Jun 2000 08:26:40 -0700 (PDT) -Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1]) - by golem.jpl.nasa.gov (Postfix) with ESMTP - id AB8AE6F51; Wed, 21 Jun 2000 15:27:36 +0000 (UTC) -Sender: lockhart@mythos.jpl.nasa.gov -Message-ID: <3950DEE8.2DB4B401@alumni.caltech.edu> -Date: Wed, 21 Jun 2000 15:27:36 +0000 -From: Thomas Lockhart -Organization: Yes -X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Bruce Momjian -Cc: Peter Eisentraut , Jan Wieck , - Tom Lane , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006211511.LAA07416@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: RO - -> Sorry, disagree. Environment variables are a pain to administer, and -> quite counter-intuitive. - -Well, I guess we disagree. But until we have a complete proposed -solution, we should leave environment variables on the table, since they -*do* allow some decoupling of logical and physical storage, and *do* -give the administrator some control over resources *that the admin would -not otherwise have*. - -> > istm that from a portability and evolutionary standpoint OID-only -> > file names (or at least file names *not* based on relation/class -> > names) is a requirement. -> Maybe a requirement at some point for some installations, but I hope -> not a general requirement. - -If a table name can have characters which are not legal for file names, -then how would you propose to support it? If we are doing a -restructuring of the storage scheme, this should be taken into account. - -lockhart=# create table "one/two" (i int); -ERROR: cannot create one/two - -Why not? It demonstrates an unfortunate linkage between file systems and -database resources. - - - Thomas - -From tgl@sss.pgh.pa.us Wed Jun 21 11:31:18 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08164; - Wed, 21 Jun 2000 11:31:12 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA15786; Wed, 21 Jun 2000 11:29:30 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA04451; - Wed, 21 Jun 2000 11:28:09 -0400 (EDT) -To: Thomas Lockhart -cc: Bruce Momjian , Peter Eisentraut , - Jan Wieck , Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <39505061.F42334AB@alumni.caltech.edu> -References: <200006201753.NAA27293@candle.pha.pa.us> <39505061.F42334AB@alumni.caltech.edu> -Comments: In-reply-to Thomas Lockhart - message dated "Wed, 21 Jun 2000 05:19:29 -0000" -Date: Wed, 21 Jun 2000 11:28:09 -0400 -Message-ID: <4448.961601289@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Thomas Lockhart writes: -> Well, as y'all have noticed, I think there are strong reasons to use -> environment variables to manage locations, and that symlinks are a -> potential portability and robustness problem. - -Reasons? Evidence? - -> An additional point which has relevance to this whole discussion: -> In the future we may allow system resource such as tables to carry names -> which use multi-byte encodings. afaik these encodings are not allowed to -> be used for physical file names, and even if they were the utility of -> using standard operating system utilities like ls goes way down. - -Good point, although in one sense a string is a string --- as long as -we don't allow embedded nulls in server-side encodings, we could use -anything that Postgres thought was a name in a filename, and the OS -should take it. But if your local ls doesn't show it the way you see -in Postgres, the usefulness of having the tablename in the filename -goes way down. - -> istm that from a portability and evolutionary standpoint OID-only file -> names (or at least file names *not* based on relation/class names) is a -> requirement. - -No argument from me ;-). I've been looking for compromise positions -but I still think that pure numeric filenames are the cleanest solution. - -There's something else that should be taken into account: for WAL, the -log will need to record the table file that each insert/delete/update -operation affects. 
To do that with the smgr-token-is-a-pathname -approach I was suggesting yesterday, I think you have to record the -database name and pathname in each WAL log entry. That's 64 bytes/log -entry which is a *lot*. If we bit the bullet and restricted ourselves -to numeric filenames then the log would need just four numeric values: - database OID - tablespace OID - relation OID - relation version number -(this set of 4 values would also be an smgr file reference token). -16 bytes/log entry looks much better than 64. - -At the moment I can recall the following opinions: - -Pure OID filenames: Thomas, Tom, Marc, Peter E. - -OID+relname filenames: Bruce - -Vadim was in the pure-OID camp a few months ago, but I won't presume -to list him there now since he hasn't been involved in this most -recent round of discussions. I'm not sure where anyone else stands... -but at least in terms of the core group it's pretty clear where the -majority opinion is. - - regards, tom lane - -From lamar.owen@wgcr.org Wed Jun 21 11:51:39 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA09021; - Wed, 21 Jun 2000 11:51:38 -0400 (EDT) -Received: from www.wgcr.org (IDENT:root@www.wgcr.org [206.74.232.194]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA18613; Wed, 21 Jun 2000 11:51:48 -0400 (EDT) -Received: from wgcr.org ([206.74.232.197]) - by www.wgcr.org (8.9.3/8.9.3/WGCR) with ESMTP id LAA19124; - Wed, 21 Jun 2000 11:48:25 -0400 -Message-ID: <3950E3C3.7322BD70@wgcr.org> -Date: Wed, 21 Jun 2000 11:48:19 -0400 -From: Lamar Owen -X-Mailer: Mozilla 4.61 [en] (Win95; I) -X-Accept-Language: en -MIME-Version: 1.0 -To: Tom Lane -CC: Thomas Lockhart , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - Hiroshi Inoue , - Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006201753.NAA27293@candle.pha.pa.us> <39505061.F42334AB@alumni.caltech.edu> <4448.961601289@sss.pgh.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: ROr - -Tom Lane wrote: - -> Thomas Lockhart writes: -> > Well, as y'all have noticed, I think there are strong reasons to use -> > environment variables to manage locations, and that symlinks are a -> > potential portability and robustness problem. - -> Reasons? Evidence? - -Does Win32 do symlinks these days? I know Win32 does envvars, and Win32 -is currently a supported platform. - -I'm not thrilled with either solution -- envvars have their problems -just as surely as symlinks do. - -> At the moment I can recall the following opinions: - -> Pure OID filenames: Thomas, Tom, Marc, Peter E. - -FWIW, count me here. I have tried administering my system using the -filenames -- and have been bitten. Better admin tools in the PostgreSQL -package beat using standard filesystem tools -- the PostgreSQL tools can -be WAL-aware, transaction-aware, and can provide consistent results. -Filesystem tools never will be able to provide consistent results for a -database system that must remain up 24x7, as many if not most PostgreSQL -installations must. - -> OID+relname filenames: Bruce - -Sorry Bruce -- I understand and am sympathetic to your position, and, at -one time, I agreed with it. But not any more. 
- --- -Lamar Owen -WGCR Internet Radio -1 Peter 4:11 - -From tgl@sss.pgh.pa.us Wed Jun 21 12:10:06 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA09885 - for ; Wed, 21 Jun 2000 12:10:04 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04789; - Wed, 21 Jun 2000 12:10:15 -0400 (EDT) -To: Bruce Momjian -cc: Hiroshi Inoue , Peter Eisentraut , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006211545.LAA08773@candle.pha.pa.us> -References: <200006211545.LAA08773@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 21 Jun 2000 11:45:12 -0400" -Date: Wed, 21 Jun 2000 12:10:15 -0400 -Message-ID: <4786.961603815@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> Yes, that is true. My idea is that they may want to create loc1 and -> loc2 which initially point to the same location, but later may be moved. -> For example, one tablespace for tables, another for indexes. They may -> initially point to the same directory, but later be split. - -Well, that opens up a completely different issue, which is what about -moving tables from one tablespace to another? - -I think the way you appear to be implying above (shut down the server -so that you can rearrange subdirectories by hand) is the wrong way to -go about it. For one thing, lots of people don't want to shut down -their servers completely for that long, but it's difficult to avoid -doing so if you want to move files by filesystem commands. For another -thing, the above approach requires guessing in advance --- maybe long -in advance --- how you are going to want to repartition your database -when it gets too big for your existing storage. - -The right way to address this problem is to invent a "move table to -new tablespace" command. 
This'd be pretty trivial to implement based -on a file-versioning approach: the new version of the pg_class tuple -has a new tablespace identifier in it. - - regards, tom lane - -From pgsql-hackers-owner+M3670@hub.org Wed Jun 21 12:30:42 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA10371 - for ; Wed, 21 Jun 2000 12:30:41 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA22315 for ; Wed, 21 Jun 2000 12:23:18 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5LGJU175424; - Wed, 21 Jun 2000 12:19:30 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5LGJJ175359 - for ; Wed, 21 Jun 2000 12:19:19 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04878; - Wed, 21 Jun 2000 12:17:38 -0400 (EDT) -To: Bruce Momjian -cc: Lamar Owen , - Thomas Lockhart , - Peter Eisentraut , Jan Wieck , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006211603.MAA09414@candle.pha.pa.us> -References: <200006211603.MAA09414@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 21 Jun 2000 12:03:12 -0400" -Date: Wed, 21 Jun 2000 12:17:37 -0400 -Message-ID: <4875.961604257@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Bruce Momjian writes: ->> Sorry Bruce -- I understand and am sympathetic to your position, and, at ->> one time, I agreed with it. But not any more. 
- -> I thought the most recent proposal was to just throw ~16 chars of the -> file name on the end of the file name, and that should not be used for -> anything except visibility. WAL would not need to store that. It could -> just grab the file name that matches the oid/sequence number. - -But that's extra complexity in WAL, plus extra complexity in renaming -tables (if you want the filename to track the logical table name, which -I expect you would), plus extra complexity in smgr and bufmgr and other -places. - -I think people are coming around to the notion that it's better to keep -these low-level operations simple, even if we need to expend more work -on high-level admin tools as a result. - -But we do need to remember to expend that effort on tools! Let's not -drop the ball on that, folks. - - regards, tom lane - -From tgl@sss.pgh.pa.us Wed Jun 21 12:30:40 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA10364 - for ; Wed, 21 Jun 2000 12:30:38 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA22593 for ; Wed, 21 Jun 2000 12:25:58 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04944; - Wed, 21 Jun 2000 12:24:44 -0400 (EDT) -To: Bruce Momjian -cc: Hiroshi Inoue , Peter Eisentraut , - Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006211614.MAA09938@candle.pha.pa.us> -References: <200006211614.MAA09938@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 21 Jun 2000 12:14:59 -0400" -Date: Wed, 21 Jun 2000 12:24:44 -0400 -Message-ID: <4941.961604684@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: ->> Well, that opens up a completely different issue, which is what about ->> moving tables from one tablespace to another? - -> Are you suggesting that doing dbname/locname is somehow harder to do -> that? If you are, I don't understand why. - -It doesn't make it harder, but it still seems pointless to have the -extra directory level. Bear in mind that if we go with all-OID -filenames then you're not going to be looking at "loc1" and "loc2" -anyway, but at "5938171" and "8583727". It's not much of a convenience -to the admin to see that, so we might as well save a level of directory -lookup. - -> The general issue of moving tables between tablespaces can be done from -> in the database. I don't think it is reasonable to shut down the db to -> do that. However, I can see moving tablespaces to different symlinked -> locations may require a shutdown. - -Only if you insist on doing it outside the database using filesystem -tools. Another way is to create a new tablespace in the desired new -location, then move the tables one-by-one to that new tablespace. - -I suppose either one might be preferable depending on your access -patterns --- locking your most critical tables while they're being moved -might be as bad as a total shutdown. 
- - regards, tom lane - -From tgl@sss.pgh.pa.us Wed Jun 21 13:01:06 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA11366 - for ; Wed, 21 Jun 2000 13:01:05 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA24726 for ; Wed, 21 Jun 2000 12:47:50 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA05112; - Wed, 21 Jun 2000 12:46:34 -0400 (EDT) -To: Bruce Momjian -cc: Hiroshi Inoue , Peter Eisentraut , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006211640.MAA10498@candle.pha.pa.us> -References: <200006211640.MAA10498@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 21 Jun 2000 12:40:35 -0400" -Date: Wed, 21 Jun 2000 12:46:34 -0400 -Message-ID: <5109.961605994@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: ->>>> Are you suggesting that doing dbname/locname is somehow harder to do ->>>> that? If you are, I don't understand why. ->> ->> It doesn't make it harder, but it still seems pointless to have the ->> extra directory level. Bear in mind that if we go with all-OID ->> filenames then you're not going to be looking at "loc1" and "loc2" ->> anyway, but at "5938171" and "8583727". It's not much of a convenience ->> to the admin to see that, so we might as well save a level of directory ->> lookup. - -> Just seems easier to have stuff segregated into separate per-db -> directories for clarity. Also, as directories get bigger, finding a -> specific file in there becomes harder. Putting 10 databases all in the -> same directory seems bad in this regard. - -Huh? I wasn't arguing against making a db-specific directory below the -tablespace point. I was arguing against making *another* directory -below that one.
- -> I don't think we want to be using -> symlinks for tables if we can avoid it. - -Agreed, but where did that come from? None of these proposals mentioned -symlinks for anything but directories, AFAIR. - - regards, tom lane - -From peter@localhost.its.uu.se Wed Jun 21 14:31:13 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA13233 - for ; Wed, 21 Jun 2000 14:31:13 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA04201 for ; Wed, 21 Jun 2000 14:11:42 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:34923 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Wed, 21 Jun 2000 20:09:46 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 134p2o-0000Uo-00; Wed, 21 Jun 2000 20:16:10 +0200 -Date: Wed, 21 Jun 2000 20:16:10 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: Bruce Momjian , Hiroshi Inoue , - Jan Wieck , - "Ross J. Reedstrom" , - Don Baccus , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <29686.961511764@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -Sender: Peter Eisentraut -Status: ROr - -Tom Lane writes: - -> I think Peter was holding out for storing purely numeric tablespace OID -> and table version in pg_class and having a hardwired mapping to pathname -> somewhere in smgr. However, I think that doing it that way gains only -> micro-efficiency compared to passing a "name" around, while using the -> name approach buys us flexibility that's needed for at least some of -> the variants under discussion. - -But that name can only be a dozen or so characters, contain no slash or -other funny characters, etc. That's really poor. 
Then the alternative is -to have an internal name and an external canonical name. Then you have two -names to worry about. Also consider that when you store both the table -space oid and the internal name in pg_class you create redundant data. -What if you rename the table space? Do you leave the internal name out of -sync? Then what good is the internal name? I'm just concerned that we are -creating at the table space level problems similar to that we're trying to -get rid of at the relation and database level. - - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From tgl@sss.pgh.pa.us Wed Jun 21 18:14:19 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA24147 - for ; Wed, 21 Jun 2000 18:14:18 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id RAA24649 for ; Wed, 21 Jun 2000 17:40:59 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA06031; - Wed, 21 Jun 2000 17:39:38 -0400 (EDT) -To: Bruce Momjian -cc: Peter Eisentraut , Hiroshi Inoue , - Jan Wieck , - "Ross J. Reedstrom" , - Don Baccus , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006211842.OAA13514@candle.pha.pa.us> -References: <200006211842.OAA13514@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 21 Jun 2000 14:42:21 -0400" -Date: Wed, 21 Jun 2000 17:39:38 -0400 -Message-ID: <6028.961623578@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Bruce Momjian writes: ->> But that name can only be a dozen or so characters, contain no slash or ->> other funny characters, etc. That's really poor. Then the alternative is ->> to have an internal name and an external canonical name. Then you have two ->> names to worry about. 
Also consider that when you store both the table ->> space oid and the internal name in pg_class you create redundant data. ->> What if you rename the table space? Do you leave the internal name out of ->> sync? Then what good is the internal name? I'm just concerned that we are ->> creating at the table space level problems similar to that we're trying to ->> get rid of at the relation and database level. - -> Agreed. Having table spaces stored by directories named by oid just -> seems very complicated for no reason. - -Huh? He just gave you two very good reasons: avoid Unix-derived -limitations on the naming of tablespaces (and tables), and avoid -problems with renaming tablespaces. - -I'm pretty much firmly back in the "OID and nothing but" camp. -Or perhaps I should say "OID, file version, and nothing but", -since we still need a version number to do CLUSTER etc. - - regards, tom lane - -From vmikheev@SECTORBASE.COM Wed Jun 21 22:18:38 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07570; - Wed, 21 Jun 2000 22:18:36 -0400 (EDT) -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA29965; Wed, 21 Jun 2000 19:07:37 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Wed, 21 Jun 2000 15:58:30 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Tom Lane'" , - Thomas Lockhart - -Cc: Bruce Momjian , - Peter Eisentraut - , Jan Wieck , - Hiroshi Inoue - , - Bruce Momjian , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Wed, 21 Jun 2000 16:00:17 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - -> If we bit the bullet and restricted ourselves to numeric filenames then -> the log would need just four numeric values: -> database OID -> tablespace OID - -Is someone going to implement it for 7.1? - -> relation OID -> relation version number - -I believe that we can avoid versions using WAL... - -> (this set of 4 values would also be an smgr file reference token). -> 16 bytes/log entry looks much better than 64. -> -> At the moment I can recall the following opinions: -> -> Pure OID filenames: Thomas, Tom, Marc, Peter E. - -+ me. - -But what about LOCATIONs? I object using environment and think that -locations -must be stored in pg_control..? - -Vadim - -From Inoue@tpf.co.jp Wed Jun 21 22:18:39 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07573; - Wed, 21 Jun 2000 22:18:38 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA01857; Wed, 21 Jun 2000 19:37:04 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id IAA02627; Thu, 22 Jun 2000 08:35:27 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 08:37:42 +0900 -Message-ID: <000201bfdbd9$b1985580$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <4448.961601289@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> No argument from me ;-). I've been looking for compromise positions -> but I still think that pure numeric filenames are the cleanest solution. -> -> There's something else that should be taken into account: for WAL, the -> log will need to record the table file that each insert/delete/update -> operation affects. To do that with the smgr-token-is-a-pathname -> approach I was suggesting yesterday, I think you have to record the -> database name and pathname in each WAL log entry. That's 64 bytes/log -> entry which is a *lot*. If we bit the bullet and restricted ourselves -> to numeric filenames then the log would need just four numeric values: -> database OID -> tablespace OID - -I strongly object to keep tablespace OID for smgr file reference token -though we have to keep it for another purpose of cource. I've mentioned -many times tablespace(where to store) info should be distinguished from -*where it is stored* info. Generally tablespace isn't sufficiently -restrictive -for this purpose. e.g. there was an idea about round-robin. e.g. Oracle's -tablespace could have pluaral files... etc. -IMHO,it is misleading to use tablespace OID as (a part of) reference token. - -> relation OID -> relation version number -> (this set of 4 values would also be an smgr file reference token). -> 16 bytes/log entry looks much better than 64. -> - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - - -From Inoue@tpf.co.jp Wed Jun 21 22:18:15 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07540; - Wed, 21 Jun 2000 22:18:11 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA04100; Wed, 21 Jun 2000 20:15:09 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id JAA02691; Thu, 22 Jun 2000 09:14:15 +0900 -From: "Hiroshi Inoue" -To: "Mikheev, Vadim" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "'Tom Lane'" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 09:16:30 +0900 -Message-ID: <000301bfdbdf$1d0dd920$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM] -> -> > If we bit the bullet and restricted ourselves to numeric filenames then -> > the log would need just four numeric values: -> > database OID -> > tablespace OID -> -> Is someone going to implement it for 7.1? -> -> > relation OID -> > relation version number -> -> I believe that we can avoid versions using WAL... -> - -How to re-construct tables in place ? -Is the following right ? -1) save the content of current table to somewhere -2) shrink the table and related indexes -3) reload the saved(+some filtering) content - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From Inoue@tpf.co.jp Wed Jun 21 22:18:16 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07553; - Wed, 21 Jun 2000 22:18:15 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA05872; Wed, 21 Jun 2000 20:44:21 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id JAA02750; Thu, 22 Jun 2000 09:43:31 +0900 -From: "Hiroshi Inoue" -To: "Mikheev, Vadim" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "'Tom Lane'" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 09:45:46 +0900 -Message-ID: <000401bfdbe3$3420fee0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2C@SECTORBASE1> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM] -> -> > > > relation version number -> > > -> > > I believe that we can avoid versions using WAL... -> > > -> > -> > How to re-construct tables in place ? -> > Is the following right ? -> > 1) save the content of current table to somewhere -> > 2) shrink the table and related indexes -> > 3) reload the saved(+some filtering) content -> -> Or - create tmp file and load with new content; log "intent to -> relink table -> file"; -> relink table file; log "file is relinked". -> - -It seems to me that whole content of the table should be -logged before relinking or shrinking. -Is my understanding right ? 
- -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From pgsql-hackers-owner+M3700@hub.org Wed Jun 21 22:17:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07504 - for ; Wed, 21 Jun 2000 22:17:58 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA07914 for ; Wed, 21 Jun 2000 21:23:22 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5M1It194420; - Wed, 21 Jun 2000 21:18:55 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5M1Ig194334 - for ; Wed, 21 Jun 2000 21:18:43 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id KAA02808; Thu, 22 Jun 2000 10:12:45 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 10:15:01 +0900 -Message-ID: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <4448.961601289@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> At the moment I can recall the following opinions: -> -> Pure OID filenames: Thomas, Tom, Marc, Peter E. -> -> OID+relname filenames: Bruce -> - -Please add my opinion to the list. 
- -Unique-id filename: Hiroshi - (Unqiue-id is irrelevant to OID/relname). - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From pgsql-hackers-owner+M3701@hub.org Wed Jun 21 22:18:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07513 - for ; Wed, 21 Jun 2000 22:18:01 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA08502 for ; Wed, 21 Jun 2000 21:33:13 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5M1QS107400; - Wed, 21 Jun 2000 21:26:28 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5M1QA107223 - for ; Wed, 21 Jun 2000 21:26:10 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id KAA02831; Thu, 22 Jun 2000 10:25:11 +0900 -From: "Hiroshi Inoue" -To: "Mikheev, Vadim" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "'Tom Lane'" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 10:27:26 +0900 -Message-ID: <000601bfdbe9$0658a980$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2D@SECTORBASE1> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM] -> -> > > Or - create tmp file and load with new content; -> > > log "intent to relink table file"; -> > > relink table file; log "file is relinked". -> > -> > It seems to me that whole content of the table should be -> > logged before relinking or shrinking. -> -> Why not just fsync tmp files? -> - -Probably I've misunderstood *relink*. -If *relink* different from *rename* ? - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From vmikheev@SECTORBASE.COM Wed Jun 21 22:17:52 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07492; - Wed, 21 Jun 2000 22:17:51 -0400 (EDT) -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA08730; Wed, 21 Jun 2000 21:37:44 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Wed, 21 Jun 2000 18:28:36 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C2F@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Hiroshi Inoue'" -Cc: Bruce Momjian , - Peter Eisentraut - , Jan Wieck , - Bruce Momjian - , - PostgreSQL-development - , - "Ross J. 
Reedstrom" , - "'Tom Lane'" , - Thomas Lockhart - -Subject: RE: [HACKERS] Big 7.1 open items -Date: Wed, 21 Jun 2000 18:30:23 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - -> > > > Or - create tmp file and load with new content; -> > > > log "intent to relink table file"; -> > > > relink table file; log "file is relinked". -> > > -> > > It seems to me that whole content of the table should be -> > > logged before relinking or shrinking. -> > -> > Why not just fsync tmp files? -> > -> -> Probably I've misunderstood *relink*. -> If *relink* different from *rename* ? - -I ment something like this - link(table file, tmp2 file); fsync(tmp2 file); -unlink(table file); link(tmp file, table file); fsync(table file); -unlink(tmp file). We can do additional logging (with log flush) of these -steps -if required, postpone on-recovery redo of operations till last relink log -record/ -end of log/transaction abort etc etc etc. - -Vadim - -From Inoue@tpf.co.jp Wed Jun 21 23:22:36 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA10350 - for ; Wed, 21 Jun 2000 23:22:35 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA13743 for ; Wed, 21 Jun 2000 23:07:50 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id MAA03008; Thu, 22 Jun 2000 12:07:00 +0900 -From: "Hiroshi Inoue" -To: "Mikheev, Vadim" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "'Tom Lane'" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 12:09:15 +0900 -Message-ID: <000801bfdbf7$3f674200$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2F@SECTORBASE1> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM] -> -> > > > > Or - create tmp file and load with new content; -> > > > > log "intent to relink table file"; -> > > > > relink table file; log "file is relinked". -> > > > -> > > > It seems to me that whole content of the table should be -> > > > logged before relinking or shrinking. -> > > -> > > Why not just fsync tmp files? -> > > -> > -> > Probably I've misunderstood *relink*. -> > If *relink* different from *rename* ? -> -> I ment something like this - link(table file, tmp2 file); -> fsync(tmp2 file); -> unlink(table file); link(tmp file, table file); fsync(table file); -> unlink(tmp file). - -I see,old file would be rolled back from tmp2 file on abort. -This would work on most platforms. -But cygwin port has a flaw that files could not be unlinked -if they are open. So *relink* may fail in some cases(including -rollback cases). - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From tgl@sss.pgh.pa.us Wed Jun 21 23:22:38 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA10353 - for ; Wed, 21 Jun 2000 23:22:36 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA14206 for ; Wed, 21 Jun 2000 23:16:26 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA07099; - Wed, 21 Jun 2000 23:14:50 -0400 (EDT) -To: "Mikheev, Vadim" -cc: Thomas Lockhart , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1> -References: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1> -Comments: In-reply-to "Mikheev, Vadim" - message dated "Wed, 21 Jun 2000 16:00:17 -0700" -Date: Wed, 21 Jun 2000 23:14:50 -0400 -Message-ID: <7096.961643690@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -"Mikheev, Vadim" writes: ->> relation OID ->> relation version number - -> I believe that we can avoid versions using WAL... - -I don't think so. You're basically saying that - 1. create file 'new' - 2. delete file 'old' - 3. rename 'new' to 'old' -is safe as long as you have a redo log to ensure that the rename -happens even if you crash between steps 2 and 3. But crash is not -the only hazard. What if step 3 just plain fails? Redo won't help. - -I'm having a hard time inventing really plausible examples, but a -slightly implausible example is that someone chmod's the containing -directory -w between steps 2 and 3. (Maybe it's not so implausible -if you assume a crash after step 2 ... someone might have left the -directory nonwritable while restoring the system.) 
- -If we use file version numbers, then the *only* thing needed to -make a valid transition between one set of files and another is -a commit of the update of pg_class that shows the new version number -in the rel's pg_class tuple. The worst that can happen to you in -a crash or other failure is that you are unable to get rid of the -set of files that you don't want anymore. That might waste disk -space but it doesn't leave the database corrupted. - -> But what about LOCATIONs? I object using environment and think that -> locations must be stored in pg_control..? - -I don't like environment variables for this either; it's just way too -easy to start the postmaster with wrong environment. It still seems -to me that relying on subdirectory symlinks is a good way to go. -pg_control is not so good --- if it gets corrupted, how do you recover? -symlinks can be recreated by hand if necessary, but... - - regards, tom lane - -From pgsql-hackers-owner+M3711@hub.org Thu Jun 22 01:01:06 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA22245 - for ; Thu, 22 Jun 2000 01:01:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA18310 for ; Thu, 22 Jun 2000 00:43:00 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5M3US167109; - Wed, 21 Jun 2000 23:30:28 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5M3U0164115 - for ; Wed, 21 Jun 2000 23:30:00 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA07156; - Wed, 21 Jun 2000 23:27:10 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "Thomas Lockhart" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> -References: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> -Comments: In-reply-to "Hiroshi Inoue" - message dated "Thu, 22 Jun 2000 10:15:01 +0900" -Date: Wed, 21 Jun 2000 23:27:10 -0400 -Message-ID: <7153.961644430@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Hiroshi Inoue" writes: -> Please add my opinion to the list. -> Unique-id filename: Hiroshi -> (Unqiue-id is irrelevant to OID/relname). - -"Unique ID" is more or less equivalent to "OID + version number", -right? - -I was trying earlier to convince myself that a single unique-ID value -would be better than OID+version for the smgr interface, because it'd -certainly be easier to pass around. I failed to convince myself though, -and the thing that bothered me was this. Suppose you are trying to -recover a corrupted database manually, and the only information you have -about which table is which is a somewhat out-of-date listing of OIDs -versus table names. (Maybe it's out of date because you got it from -your last backup tape.) If the files are named OID+version you're not -going to have much trouble seeing which is which, even if some of the -versions are higher than what was on the tape. But if version-updated -tables are given entirely new unique IDs, you've got no hope at all of -telling which one corresponds to what you had in the listing. Maybe -you can tell by looking through the physical file contents, but -certainly this way is more fragile from the point of view of data -recovery. 
- - regards, tom lane - -From tgl@sss.pgh.pa.us Thu Jun 22 01:01:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA22232; - Thu, 22 Jun 2000 01:00:59 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA17842; Thu, 22 Jun 2000 00:31:06 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA07254; - Thu, 22 Jun 2000 00:29:42 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "Bruce Momjian" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "Thomas Lockhart" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <000201bfdbd9$b1985580$2801007e@tpf.co.jp> -References: <000201bfdbd9$b1985580$2801007e@tpf.co.jp> -Comments: In-reply-to "Hiroshi Inoue" - message dated "Thu, 22 Jun 2000 08:37:42 +0900" -Date: Thu, 22 Jun 2000 00:29:42 -0400 -Message-ID: <7251.961648182@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -"Hiroshi Inoue" writes: -> I strongly object to keep tablespace OID for smgr file reference token -> though we have to keep it for another purpose of cource. I've mentioned -> many times tablespace(where to store) info should be distinguished from -> *where it is stored* info. - -Sure. But this proposal assumes that we're relying on symlinks to -carry the information about physical locations corresponding to -tablespace OIDs. The backend just needs to know enough to access a -relation file at a relative pathname like - tablespaceOID/relationOID -(ignoring version and segment numbers for now). Under the hood, -a symlink for tablespaceOID gets the work done. - -Certainly this is not a perfect mechanism. 
But it is simple, it -is reliable, it is portable to most of the platforms we care about -(yeah, I know we have a Win port, but you wouldn't ever recommend -someone to run a *serious* database on it would you?), and in general -I think the bang-for-the-buck ratio is enormous. I do not want to -have to deal with explicit tablespace bookkeeping in the backend, -but that seems like what we'd have to do in order to improve on -symlinks. - - regards, tom lane - -From pgsql-hackers-owner+M3720@hub.org Thu Jun 22 02:01:02 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24025 - for ; Thu, 22 Jun 2000 02:01:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA21392 for ; Thu, 22 Jun 2000 01:56:49 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5M5jp143149; - Thu, 22 Jun 2000 01:45:51 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5M5jT143025 - for ; Thu, 22 Jun 2000 01:45:29 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA11735; - Wed, 21 Jun 2000 22:44:28 -0700 (PDT) -Message-Id: <3.0.1.32.20000621224122.035b8c80@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Wed, 21 Jun 2000 22:41:22 -0700 -To: Chris Bitmead , - Bruce Momjian -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: PostgreSQL-development -In-Reply-To: <39518B7C.F76108FD@nimrod.itg.telecom.com.au> -References: <200006220229.WAA08130@candle.pha.pa.us> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -At 01:43 PM 6/22/00 +1000, Chris 
Bitmead wrote: - ->I'm wondering if pg_dump should store the location of the tablespace. If ->your machine dies, you get a new machine to re-create the database, you ->may not want the tablespace in the same spot. And text-editing a ->gigabyte file would be extremely painful. - -So you don't dump your create tablespace statements, recognizing that on -a new machine (due to upgrades or crashing) you might assign them to -different directories/mount points/whatever. That's the reason for -wanting to hide physical allocation in tablespaces ... the rest of -your datamodel doesn't need to know. - -Or you do dump your tablespaces, and knowing the paths assigned -to various ones set up your new machine accordingly. - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From dhogaza@pacifier.com Thu Jun 22 02:00:58 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24005 - for ; Thu, 22 Jun 2000 02:00:58 -0400 (EDT) -Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA21369 for ; Thu, 22 Jun 2000 01:56:18 -0400 (EDT) -Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68]) - by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA12121; - Wed, 21 Jun 2000 22:55:39 -0700 (PDT) -Message-Id: <3.0.1.32.20000621225149.035bc070@mail.pacifier.com> -X-Sender: dhogaza@mail.pacifier.com -X-Mailer: Windows Eudora Pro Version 3.0.1 (32) -Date: Wed, 21 Jun 2000 22:51:49 -0700 -To: Bruce Momjian , - Chris Bitmead -From: Don Baccus -Subject: Re: [HACKERS] Big 7.1 open items -Cc: PostgreSQL-development -In-Reply-To: <200006220403.AAA15648@candle.pha.pa.us> -References: <39518B7C.F76108FD@nimrod.itg.telecom.com.au> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 12:03 AM 6/22/00 -0400, Bruce Momjian 
wrote: - ->If the symlink create fails in CREATE TABLESPACE, it just creates an ->ordinary directory. - -Silent surprises - the earmark of truly professional software ... - - - -- Don Baccus, Portland OR - Nature photos, on-line guides, Pacific Northwest - Rare Bird Alert Service and other goodies at - http://donb.photo.net. - -From Inoue@tpf.co.jp Thu Jun 22 02:01:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24009 - for ; Thu, 22 Jun 2000 02:00:59 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA21277 for ; Thu, 22 Jun 2000 01:54:44 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id OAA03303; Thu, 22 Jun 2000 14:53:52 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 14:56:07 +0900 -Message-ID: <000901bfdc0e$8f32fec0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <7251.961648182@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> "Hiroshi Inoue" writes: -> > I strongly object to keep tablespace OID for smgr file reference token -> > though we have to keep it for another purpose of cource. I've mentioned -> > many times tablespace(where to store) info should be distinguished from -> > *where it is stored* info. -> -> Sure. 
But this proposal assumes that we're relying on symlinks to -> carry the information about physical locations corresponding to -> tablespace OIDs. The backend just needs to know enough to access a -> relation file at a relative pathname like -> tablespaceOID/relationOID -> (ignoring version and segment numbers for now). Under the hood, -> a symlink for tablespaceOID gets the work done. -> - -I think tablespaceOID is an easy substitution for the purpose. -I don't like to depend on poor directory tree structure in dbms -either.. - -> Certainly this is not a perfect mechanism. But it is simple, it -> is reliable, it is portable to most of the platforms we care about -> (yeah, I know we have a Win port, but you wouldn't ever recommend -> someone to run a *serious* database on it would you?), and in general -> I think the bang-for-the-buck ratio is enormous. I do not want to -> have to deal with explicit tablespace bookkeeping in the backend, -> but that seems like what we'd have to do in order to improve on -> symlinks. -> - -I've already mentioned about it 10 times or so but unfortunately -I see no one on my side yet. -OK,I've given up the discussion about it. I don't want to waste -my time any more. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From tgl@sss.pgh.pa.us Thu Jun 22 03:31:04 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28813 - for ; Thu, 22 Jun 2000 03:31:03 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA23901 for ; Thu, 22 Jun 2000 03:06:47 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA07725; - Thu, 22 Jun 2000 03:05:00 -0400 (EDT) -To: Chris Bitmead -cc: Bruce Momjian , - PostgreSQL-development -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <39518B7C.F76108FD@nimrod.itg.telecom.com.au> -References: <200006220229.WAA08130@candle.pha.pa.us> <39518B7C.F76108FD@nimrod.itg.telecom.com.au> -Comments: In-reply-to Chris Bitmead - message dated "Thu, 22 Jun 2000 13:43:56 +1000" -Date: Thu, 22 Jun 2000 03:05:00 -0400 -Message-ID: <7722.961657500@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Chris Bitmead writes: -> I'm wondering if pg_dump should store the location of the tablespace. If -> your machine dies, you get a new machine to re-create the database, you -> may not want the tablespace in the same spot. And text-editing a -> gigabyte file would be extremely painful. - -Might make sense to store the tablespace setup separately from the bulk -of the data, but certainly you want some way to dump that info in a -restorable form. - -I've been thinking lately that the pg_dump shove-it-all-in-one-file -approach doesn't scale anyway. We ought to start thinking about ways -to make the standard dump method store schema separately from bulk -data, for example. That's offtopic for this thread but ought to be -on the TODO list someplace... 
- - regards, tom lane - -From pgsql-hackers-owner+M3727@hub.org Thu Jun 22 03:31:06 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28819 - for ; Thu, 22 Jun 2000 03:31:05 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA24751 for ; Thu, 22 Jun 2000 03:29:00 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5M7KP140211; - Thu, 22 Jun 2000 03:20:25 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5M7Jb139991 - for ; Thu, 22 Jun 2000 03:19:37 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA07785; - Thu, 22 Jun 2000 03:17:45 -0400 (EDT) -To: "Philip J. Warner" -cc: "Hiroshi Inoue" , - "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. Reedstrom" , - "Thomas Lockhart" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au> -References: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au> -Comments: In-reply-to "Philip J. Warner" - message dated "Thu, 22 Jun 2000 16:31:33 +1000" -Date: Thu, 22 Jun 2000 03:17:45 -0400 -Message-ID: <7782.961658265@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Philip J. Warner" writes: ->> ... the thing that bothered me was this. Suppose you are trying to ->> recover a corrupted database manually, and the only information you have ->> about which table is which is a somewhat out-of-date listing of OIDs ->> versus table names. 
- -> This worries me a little; in the Dec/RDB world it is a very long time since -> database backups were done by copying the files. There is a database -> backup/restore utility which runs while the database is on-line and makes -> sure a valid snapshot is taken. Backing up storage areas (table spapces) -> can be done separately by the same utility, and again, it records enough -> information to ensure integrity. Maybe the thing to do is write a pg_backup -> utility, which in a first pass could, presumably, be synonymous with pg_dump? - -pg_dump already does the consistent-snapshot trick (it just has to run -inside a single transaction). - -> Am I missing something here? Is there a problem with backing up using -> 'pg_dump | gzip'? - -None, as long as your ambition extends no further than restoring your -data to where it was at your last pg_dump. I was thinking about the -all-too-common-in-the-real-world scenario where you're hoping to recover -some data more recent than your last backup from the fractured shards -of your database... 
- - regards, tom lane - -From zeugswettera@wien.spardat.at Thu Jun 22 05:01:11 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29525 - for ; Thu, 22 Jun 2000 05:01:09 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA27070 for ; Thu, 22 Jun 2000 04:38:32 -0400 (EDT) -Received: from peligor.server.lan.at (peligor.server.lan.at [10.8.32.84]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA23252; - Thu, 22 Jun 2000 10:37:45 +0200 -Received: from zeus (totalctlh1-port029.f000.d0188.sd.spardat.at [10.8.35.226]) - by peligor.server.lan.at (8.9.1/8.9.1) with SMTP id KAA02457; - Thu, 22 Jun 2000 10:41:04 GMT -From: Zeugswetter Andreas SB -To: Chris Bitmead , - Bruce Momjian -Subject: Re: Big 7.1 open items -Date: Thu, 22 Jun 2000 09:49:07 +0200 -X-Mailer: KMail [version 1.0.29.1] -Content-Type: text/plain -Cc: PostgreSQL-development -References: <200006220229.WAA08130@candle.pha.pa.us> <39518B7C.F76108FD@nimrod.itg.telecom.com.au> -In-Reply-To: <39518B7C.F76108FD@nimrod.itg.telecom.com.au> -MIME-Version: 1.0 -Message-Id: <00062210055400.00299@zeus> -Content-Transfer-Encoding: 8bit -Status: RO - - -> > pg_dump would recreate a CREATE TABLESPACE command: -> > -> > printf("CREATE TABLESPACE %s USING %s", loc, symloc); -> > -> > where symloc would be SELECT symloc(loc) and return the value into a -> > variable that is used by pg_dump. The backend would do the lstat() and -> > return the value to the client. -> -> I'm wondering if pg_dump should store the location of the tablespace. If -> your machine dies, you get a new machine to re-create the database, you -> may not want the tablespace in the same spot. And text-editing a -> gigabyte file would be extremely painful. - -Yes, that seems like a valid concern that should be kept in mind. 
-It should also be possible to restore a pg instance to a different location -on the same machine. -Maybe this could be done by adding a utility that dumps all tablespace -info which could then be altered to desire. - -I still opt for instance-wide tablespaces. People wanting separation can easily -create different tablespaces for each database, but those that only want to -separate data and index need only create two tablespaces. A typical installation would -have 1 to 4 tablespaces (systemtbs, datatbs, indextbs, toasttbs | lobdbs ) - -I would also switch the directory structure between dbname and extent subdir, -because that allows less symlinks/filesystems, and thus less admin. - -thus you would have: - tablespace1/extent1/dbname1 - tablespace1/extent2/dbname1 - tablespace1/extent1/dbname2 - -Andreas - -From pjw@rhyme.com.au Thu Jun 22 04:01:05 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA29060 - for ; Thu, 22 Jun 2000 04:01:03 -0400 (EDT) -Received: from acheron.rime.com.au (root@albatr.lnk.telstra.net [139.130.54.222]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA25604 for ; Thu, 22 Jun 2000 03:50:30 -0400 (EDT) -Received: from oberon (Oberon.rime.com.au [203.8.195.100]) - by acheron.rime.com.au (8.9.3/8.9.3) with SMTP id RAA08811; - Thu, 22 Jun 2000 17:43:22 +1000 -Message-Id: <3.0.5.32.20000622175015.00a10160@mail.rhyme.com.au> -X-Sender: pjw@mail.rhyme.com.au -X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32) -Date: Thu, 22 Jun 2000 17:50:15 +1000 -To: Tom Lane -From: "Philip J. Warner" -Subject: Re: [HACKERS] Big 7.1 open items -Cc: "Hiroshi Inoue" , - "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "Thomas Lockhart" -In-Reply-To: <7782.961658265@sss.pgh.pa.us> -References: <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au> - <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> - <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> - <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au> -Mime-Version: 1.0 -Content-Type: text/plain; charset="us-ascii" -Status: RO - -At 03:17 22/06/00 -0400, Tom Lane wrote: -> ->> This worries me a little; in the Dec/RDB world it is a very long time since ->> database backups were done by copying the files. There is a database ->> backup/restore utility which runs while the database is on-line and makes ->> sure a valid snapshot is taken. Backing up storage areas (table spapces) ->> can be done separately by the same utility, and again, it records enough ->> information to ensure integrity. Maybe the thing to do is write a pg_backup ->> utility, which in a first pass could, presumably, be synonymous with -pg_dump? -> ->pg_dump already does the consistent-snapshot trick (it just has to run ->inside a single transaction). -> ->> Am I missing something here? Is there a problem with backing up using ->> 'pg_dump | gzip'? -> ->None, as long as your ambition extends no further than restoring your ->data to where it was at your last pg_dump. I was thinking about the ->all-too-common-in-the-real-world scenario where you're hoping to recover ->some data more recent than your last backup from the fractured shards ->of your database... -> - -pg_dump is a good basis for any pg_backup utility; perhaps as you indicated -elsewhere, more carefull formatting of the dump files would make -table-based restoration possible. In another response, I also suggested -allowing overrides of placement information in a restore operation- the -simplest approach would be an 'ignore-storage-parameters' flag. Does this -sound reasonable? If so, then discussion of file-id based on OID needs not -be too concerned about how db restoration is done. 
- - - - - ----------------------------------------------------------------- -Philip Warner | __---_____ -Albatross Consulting Pty. Ltd. |----/ - \ -(A.C.N. 008 659 498) | /(@) ______---_ -Tel: (+61) 0500 83 82 81 | _________ \ -Fax: (+61) 0500 83 82 82 | ___________ | -Http://www.rhyme.com.au | / \| - | --________-- -PGP key available upon request, | / -and from pgp5.ai.mit.edu:11371 |/ - -From pgsql-hackers-owner+M3730@hub.org Thu Jun 22 05:31:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29741 - for ; Thu, 22 Jun 2000 05:31:00 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id FAA28478 for ; Thu, 22 Jun 2000 05:18:37 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5M96W171286; - Thu, 22 Jun 2000 05:06:32 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5M96A168442 - for ; Thu, 22 Jun 2000 05:06:10 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id SAA03635; Thu, 22 Jun 2000 18:05:02 +0900 -From: "Hiroshi Inoue" -To: "Peter Eisentraut" -Cc: "Tom Lane" , "Bruce Momjian" , - "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 18:07:18 +0900 -Message-ID: <000c01bfdc29$43f717a0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Peter Eisentraut [mailto:e99re41@DoCS.UU.SE] -> -> > My opinion -> > 3) database and tablespace are relatively irrelevant. -> > I assume PostgreSQL's database would correspond -> > to the concept of SCHEMA. -> -> A database corresponds to a catalog and a schema corresponds to nothing -> yet. -> - -Oh I see your point. However I've thought that current PostgreSQL's -database is an imcomplete SCHEMA and still feel so in reality. -Catalog per database has been nothing but needless for me from -the first. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From Inoue@tpf.co.jp Thu Jun 22 07:31:01 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA07559 - for ; Thu, 22 Jun 2000 07:31:00 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id HAA02741 for ; Thu, 22 Jun 2000 07:08:29 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id UAA03834; Thu, 22 Jun 2000 20:06:51 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "Thomas Lockhart" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 20:09:07 +0900 -Message-ID: <000d01bfdc3a$48fb35e0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -Importance: Normal -In-Reply-To: <7153.961644430@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Status: RO - -> -----Original Message----- -> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] -> -> "Hiroshi Inoue" writes: -> > Please add my opinion to the list. -> > Unique-id filename: Hiroshi -> > (Unqiue-id is irrelevant to OID/relname). -> -> "Unique ID" is more or less equivalent to "OID + version number", -> right? -> - -Hmm,no one seems to be on my side at this point also. -OK,I change my mind as follows. - - OID except cygwin,unique-id on cygwin - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - -From tgl@sss.pgh.pa.us Thu Jun 22 11:31:06 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA10544 - for ; Thu, 22 Jun 2000 11:31:05 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA23513 for ; Thu, 22 Jun 2000 11:28:53 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA08851; - Thu, 22 Jun 2000 11:27:30 -0400 (EDT) -To: "Hiroshi Inoue" -cc: "Bruce Momjian" , - "Peter Eisentraut" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" , - "Thomas Lockhart" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <000d01bfdc3a$48fb35e0$2801007e@tpf.co.jp> -References: <000d01bfdc3a$48fb35e0$2801007e@tpf.co.jp> -Comments: In-reply-to "Hiroshi Inoue" - message dated "Thu, 22 Jun 2000 20:09:07 +0900" -Date: Thu, 22 Jun 2000 11:27:30 -0400 -Message-ID: <8848.961687650@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -"Hiroshi Inoue" writes: -> OK,I change my mind as follows. -> OID except cygwin,unique-id on cygwin - -We don't really want to do that, do we? That's a huge difference in -behavior to have in just one port --- especially a port that none of -the primary developers use (AFAIK anyway). The cygwin port's normal -state of existence will be "broken", surely, if we go that way. - -Besides which, OID alone doesn't give us a possibility of file -versioning, and as I commented to Vadim I think we will want that, -WAL or no WAL. So it seems to me the two viable choices are -unique-id or OID+version-number. Either way, the file-naming behavior -should be the same across all platforms. - - regards, tom lane - -From vmikheev@SECTORBASE.COM Thu Jun 22 14:31:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA11892 - for ; Thu, 22 Jun 2000 14:30:59 -0400 (EDT) -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA10107 for ; Thu, 22 Jun 2000 14:17:04 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Thu, 22 Jun 2000 11:07:59 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C31@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Tom Lane'" -Cc: Thomas Lockhart , - Bruce Momjian - , - Peter Eisentraut , Jan Wieck - , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Thu, 22 Jun 2000 11:09:47 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - -> > I believe that we can avoid versions using WAL... -> -> I don't think so. You're basically saying that -> 1. create file 'new' -> 2. delete file 'old' -> 3. rename 'new' to 'old' -> is safe as long as you have a redo log to ensure that the rename -> happens even if you crash between steps 2 and 3. But crash is not -> the only hazard. What if step 3 just plain fails? Redo won't help. - -Ok, ok. Let's use *unique* file name for each table version. -But after thinking, seems that I agreed with Hiroshi about using -*some unique id* for file names instead of oid+version: we could use -just DB' OID + this unique ID in log records to find table file - just -8 bytes. - -So, add me to Hiroshi' camp... if Hiroshi is ready to implement new file -naming -:) - -> > But what about LOCATIONs? I object using environment and think that -> > locations must be stored in pg_control..? -> -> I don't like environment variables for this either; it's just way too -> easy to start the postmaster with wrong environment. It still seems -> to me that relying on subdirectory symlinks is a good way to go. - -I always thought so. - -> pg_control is not so good --- if it gets corrupted, how do -> you recover? - -Impossible to recover anyway - pg_control keeps last checkpoint pointer, -required for recovery. That's why Oracle recommends (requires?) at least -two copies of control file (and log too). -But what if log gets corrupted? Or file system (lost symlinks etc)? -One will have to use backup... 
- -Vadim - -From peter@localhost.its.uu.se Thu Jun 22 18:37:35 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA19684 - for ; Thu, 22 Jun 2000 18:37:34 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id SAA02841 for ; Thu, 22 Jun 2000 18:31:53 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:37596 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Fri, 23 Jun 2000 00:29:48 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 135FaG-00062q-00; Fri, 23 Jun 2000 00:36:28 +0200 -Date: Fri, 23 Jun 2000 00:36:28 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: Hiroshi Inoue , Bruce Momjian , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <8803.961687343@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -Sender: Peter Eisentraut -Status: RO - -Tom Lane writes: - -> In my mind the point of the "database" concept is to provide a domain -> within which custom datatypes and functions are available. - -Quoth SQL99: - -"A user-defined type is a schema object" - -"An SQL-invoked routine is an element of an SQL-schema" - -I have yet to see anything in SQL that's a per-catalog object. Some things -are global, like users, but everything else is per-schema. - -The way I see it is that schemas are required to be a logical hierarchy, -whereas implementations may see catalogs as a physical division (as indeed -this implementation does). - -> So I think we will still want "database" = "span of applicability of -> system catalogs" - -Yes, because the system catalogs would live in a schema of their own. 
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From ZeugswetterA@wien.spardat.at Mon Jun 26 04:10:01 2000 -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA29267 - for ; Mon, 26 Jun 2000 04:09:59 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA35550; - Mon, 26 Jun 2000 10:09:14 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 26 Jun 2000 10:09:14 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" , Hiroshi Inoue -Cc: Bruce Momjian , - Peter Eisentraut - , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" , - Thomas Lockhart - -Subject: [HACKERS] File versioning (was: Big 7.1 open items) -Date: Mon, 26 Jun 2000 10:09:13 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - - -> Besides which, OID alone doesn't give us a possibility of file -> versioning, and as I commented to Vadim I think we will want that, -> WAL or no WAL. So it seems to me the two viable choices are -> unique-id or OID+version-number. Either way, the file-naming behavior -> should be the same across all platforms. - -I do not think the only problem of a failing rename of "temp" to "new" -on startup rollforward is issue enough to justify the additional complexity -a version implys. -Why not simply abort startup of postmaster in such an event and let the -dba fix it. There can be no data loss. - -If e.g. the permissions of the directory are insufficient we will want to -abort -startup anyway, no? 
- -Andreas - -From ZeugswetterA@wien.spardat.at Mon Jun 26 05:32:05 2000 -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29616 - for ; Mon, 26 Jun 2000 05:32:03 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id LAA27288; - Mon, 26 Jun 2000 11:31:08 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 26 Jun 2000 11:31:08 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA598F@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Hiroshi Inoue'" , Peter Eisentraut , - Tom Lane -Cc: Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 11:31:06 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - - -> > > In my mind the point of the "database" concept is to -> provide a domain -> > > within which custom datatypes and functions are available. -> > -> -> AFAIK few users understand it and many users have wondered -> why we couldn't issue cross "database" queries. - -Imho the same issue is access to tables on another machine. -If we "fix" that, access to another db on the same instance is just -a variant of the above. - -> -> > Quoth SQL99: -> > -> > "A user-defined type is a schema object" -> > -> > "An SQL-invoked routine is an element of an SQL-schema" -> > -> > I have yet to see anything in SQL that's a per-catalog -> object. Some things -> > are global, like users, but everything else is per-schema. - -Yes. - -> So why is system catalog needed per "database" ? - -I like to use different databases on a development machine, -because it makes testing easier. The only thing that -needs to be changed is the connect statement. 
All other statements -including schema qualified tablenames stay exactly the same for -each developer even though each has his own database, -and his own version of functions. -I have yet to see an installation that does'nt have at least one program -that needs access to more than one schema. - -On production machines we (using Informix) use different databases -for different products, because it reduces the possibility of accessing -the wrong tables, since the syntax for accessing tables in other db's -is different (dbname[@instancename]:"owner".tabname in Informix) -The schema does not help us, since most of our programs access -tables from more than one schema. - -And again someone wanting Oracle'ish behavior will only create one -database per instance. - -Andreas - -From pgsql-hackers-owner+M4088@hub.org Mon Jul 3 01:57:49 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA08810 - for ; Mon, 3 Jul 2000 01:57:49 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e635u5S69222; - Mon, 3 Jul 2000 01:56:05 -0400 (EDT) -Received: from po.seiren.co.jp (po.seiren.co.jp [203.138.223.10]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5QA5d124120 - for ; Mon, 26 Jun 2000 06:05:41 -0400 (EDT) -Received: from mcadnote1 ([210.161.188.23]) by po.seiren.co.jp - (post.office MTA v1.9.3 ID# 0100012-16224) with SMTP id AAA59; - Mon, 26 Jun 2000 19:04:51 +0900 -From: "Hiroshi Inoue" -To: "Zeugswetter Andreas SB" , - "Peter Eisentraut" , "Tom Lane" -Cc: "Bruce Momjian" , "Jan Wieck" , - "PostgreSQL-development" , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 19:08:26 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="Windows-1252" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -Importance: Normal -In-Reply-To: <219F68D65015D011A8E000006F8590C605BA598F@sdexcsrv1.f000.d0188.sd.spardat.at> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: Zeugswetter Andreas SB -> -> > > > In my mind the point of the "database" concept is to -> > provide a domain -> > > > within which custom datatypes and functions are available. -> > > -> > -> > AFAIK few users understand it and many users have wondered -> > why we couldn't issue cross "database" queries. -> -> Imho the same issue is access to tables on another machine. -> If we "fix" that, access to another db on the same instance is just -> a variant of the above. -> - -What is a difference between SCHAMA and your "database" ? -I myself am confused about them. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From ZeugswetterA@wien.spardat.at Mon Jun 26 06:50:26 2000 -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA07354 - for ; Mon, 26 Jun 2000 06:50:24 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id MAA41146; - Mon, 26 Jun 2000 12:50:11 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 26 Jun 2000 12:50:11 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA5991@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Hiroshi Inoue'" , - Peter Eisentraut - , Tom Lane -Cc: Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 12:50:10 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="windows-1252" -Status: RO - -Hiroshi Inoue [mailto:Inoue@seiren.co.jp] wrote: -> > > > > In my mind the point of the "database" concept is to -> > > provide a domain -> > > > > within which custom datatypes and functions are available. -> > > > -> > > -> > > AFAIK few users understand it and many users have wondered -> > > why we couldn't issue cross "database" queries. -> > -> > Imho the same issue is access to tables on another machine. -> > If we "fix" that, access to another db on the same instance is just -> > a variant of the above. -> > -> -> What is a difference between SCHAMA and your "database" ? -> I myself am confused about them. 
- -Think of it as a hierarchy: - instance -> database -> schema -> object - -- "instance" corresponds to one postmaster -- "database" as in current implementation -- "schema" name corresponds to the owner of the object, -only that a corresponding db or os user does not need to exist in -some of the implementations I know. -- "object" is one of table, index, function ... - -The database is what you connect to in your connect statement, -you then see all schemas inside this database only. Access to another -database would need an explicitly created synonym or different syntax. -The default "schema" name is usually the logged in user name -(although I don't like this approach, I like Informix's approach where -the schema need not be specified if tabname is unique (and tabname -is unique per db unless you specify database mode ansi)). -All other schemas have to be explicitly named ("schemaname".tabname). - -Oracle has exactly this layout, only you are restricted to one database -per instance. -(They even have a "create database .." statement, although it is somehow -analogous to our initdb). - -Andreas - -From ZeugswetterA@wien.spardat.at Mon Jun 26 07:51:14 2000 -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA07648 - for ; Mon, 26 Jun 2000 07:51:12 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id NAA40848; - Mon, 26 Jun 2000 13:50:56 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 26 Jun 2000 13:50:55 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA5993@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Mikheev, Vadim'" , - "'Tom Lane'" - -Cc: Thomas Lockhart , - Bruce Momjian - , - Peter Eisentraut , Jan Wieck - , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 13:50:55 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - -Vadim wrote: -> Impossible to recover anyway - pg_control keeps last -> checkpoint pointer, required for recovery. - -Why not put this info in the tx log itself. - -> That's why Oracle recommends (requires?) at least -> two copies of control file .... - -This is one of the most stupid design issues Oracle has. -I suggest you look at the tx log design of Informix. -(No Informix dba fears to pull the power cord on his servers, -ask the same of an Oracle dba, they even fear -"shutdown immediate" on a heavily used db) - -Andreas - -From ZeugswetterA@wien.spardat.at Mon Jun 26 08:02:07 2000 -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA07760 - for ; Mon, 26 Jun 2000 08:02:05 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id OAA74134; - Mon, 26 Jun 2000 14:01:17 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 26 Jun 2000 14:01:17 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA5994@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: Zeugswetter Andreas SB , - "'Mikheev, Vadim'" , - "'Tom Lane'" - -Cc: Thomas Lockhart , - Bruce Momjian - , - Peter Eisentraut , Jan Wieck - , - Hiroshi Inoue , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 14:01:15 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - -I wrote: -> Vadim wrote: -> > Impossible to recover anyway - pg_control keeps last -> > checkpoint pointer, required for recovery. 
-> -> Why not put this info in the tx log itself. -> -> > That's why Oracle recommends (requires?) at least -> > two copies of control file .... -> -> This is one of the most stupid design issues Oracle has. - -The problem is, that if you want to switch to a no fsync environment, -(here I also mean the tx log) -but the possibility of losing a write is still there, you cannot sync -writes to two or more different files. Only one file, the tx log itself is -allowed -to carry lastminute information. - -Thus you need to txlog changes to pg_control also. - -Andreas - -From tgl@sss.pgh.pa.us Mon Jun 26 10:42:08 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA11148 - for ; Mon, 26 Jun 2000 10:42:06 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA17018; - Mon, 26 Jun 2000 10:42:31 -0400 (EDT) -To: Zeugswetter Andreas SB -cc: Hiroshi Inoue , Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" , - Thomas Lockhart -Subject: Re: [HACKERS] File versioning (was: Big 7.1 open items) -In-reply-to: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at> -References: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at> -Comments: In-reply-to Zeugswetter Andreas SB - message dated "Mon, 26 Jun 2000 10:09:13 +0200" -Date: Mon, 26 Jun 2000 10:42:31 -0400 -Message-ID: <17015.962030551@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Zeugswetter Andreas SB writes: -> I do not think the only problem of a failing rename of "temp" to "new" -> on startup rollforward is issue enough to justify the additional complexity -> a version implys. - -If that were the only reason for it then I wouldn't feel it was so -essential. 
However, it will also let us fix CLUSTER, vacuuming of -indexes, ALTER TABLE DROP COLUMN with physical removal of the column, -etc etc. Making the world safe for rollbackable RENAME/DROP/TRUNCATE -TABLE is just one of the benefits. - -Versioning also eliminates a whole host of problems at the bufmgr/smgr -level that are caused by having to cope with relation files getting -renamed out from under you. We have painfully eliminated some of these -problems over the past couple of years by ad-hoc, ugly techniques like -flushing the buffer cache when doing a rename. But who's to say there -are not more such bugs left? - -In short, I think versioning is far *less* complex, not to mention more -reliable, than the kluges we need to use to work around the lack of it. - - regards, tom lane - -From pgsql-hackers-owner+M3879@hub.org Mon Jun 26 18:30:55 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA02022 - for ; Mon, 26 Jun 2000 18:30:54 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5QMMa123238; - Mon, 26 Jun 2000 18:22:37 -0400 (EDT) -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5QMMJ123161 - for ; Mon, 26 Jun 2000 18:22:19 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Mon, 26 Jun 2000 15:13:48 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Tom Lane'" -Cc: "'Hiroshi Inoue'" , - Thomas Lockhart - , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 15:15:39 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> > Do we need *both* database & tablespace to find table file ?! -> > Imho, database shouldn't be used... -> -> That'd work fine for me, but I think Bruce was arguing for paths that -> included the database name. We'd end up with paths that go something -> like -> ..../data/tablespaces/TABLESPACEOID/RELATIONOID -> (plus some kind of decoration for segment and version), so you'd have -> a hard time telling which files in a tablespace belong to which -> database. Doesn't bother me a whole lot, personally --- if one wants - -We could create /data/databases/DATABASEOID/ and create soft-links to -table-files. This way different tables of the same database could be in -different tablespaces. /data/database path would be used in production -and /data/tablespace path would be used in recovery. - -Vadim - -From vmikheev@SECTORBASE.COM Mon Jun 26 18:21:53 2000 -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA01888 - for ; Mon, 26 Jun 2000 18:21:52 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Mon, 26 Jun 2000 15:13:48 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Tom Lane'" -Cc: "'Hiroshi Inoue'" , - Thomas Lockhart - , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 15:15:39 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - -> > Do we need *both* database & tablespace to find table file ?! 
-> > Imho, database shouldn't be used... -> -> That'd work fine for me, but I think Bruce was arguing for paths that -> included the database name. We'd end up with paths that go something -> like -> ..../data/tablespaces/TABLESPACEOID/RELATIONOID -> (plus some kind of decoration for segment and version), so you'd have -> a hard time telling which files in a tablespace belong to which -> database. Doesn't bother me a whole lot, personally --- if one wants - -We could create /data/databases/DATABASEOID/ and create soft-links to -table-files. This way different tables of the same database could be in -different tablespaces. /data/database path would be used in production -and /data/tablespace path would be used in recovery. - -Vadim - -From tgl@sss.pgh.pa.us Mon Jun 26 18:47:54 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA02118 - for ; Mon, 26 Jun 2000 18:47:52 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id SAA19579; - Mon, 26 Jun 2000 18:48:22 -0400 (EDT) -To: "Mikheev, Vadim" -cc: "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> -References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> -Comments: In-reply-to "Mikheev, Vadim" - message dated "Mon, 26 Jun 2000 15:15:39 -0700" -Date: Mon, 26 Jun 2000 18:48:22 -0400 -Message-ID: <19576.962059702@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -"Mikheev, Vadim" writes: -> We could create /data/databases/DATABASEOID/ and create soft-links to -> table-files. This way different tables of the same database could be in -> different tablespaces. /data/database path would be used in production -> and /data/tablespace path would be used in recovery. 
- -Why would you want to do it that way? Having a different access path -for recovery than for normal operation strikes me as just asking for -trouble ;-) - -The symlinks wouldn't do any good for what Bruce had in mind anyway -(IIRC, he wanted to get useful per-database numbers from "du"). - - regards, tom lane - -From pgsql-hackers-owner+M3888@hub.org Mon Jun 26 23:37:52 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA04481 - for ; Mon, 26 Jun 2000 23:37:51 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5R1nx169365; - Mon, 26 Jun 2000 21:50:00 -0400 (EDT) -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5R1mt169094 - for ; Mon, 26 Jun 2000 21:48:55 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Mon, 26 Jun 2000 18:40:19 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C38@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Tom Lane'" -Cc: "'Hiroshi Inoue'" , - Thomas Lockhart - , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Mon, 26 Jun 2000 18:42:10 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> > We could create /data/databases/DATABASEOID/ and create -> > soft-links to table-files. This way different tables of -> > the same database could be in different tablespaces. -> > /data/database path would be used in production -> > and /data/tablespace path would be used in recovery. -> -> Why would you want to do it that way? 
Having a different access path -> for recovery than for normal operation strikes me as just asking for -> trouble ;-) - -I just think that *databases* (schemas) must be used for *logical* groupping -of tables, not for *physical* one. "Where to store table" is tablespace' -related kind of things! - -> The symlinks wouldn't do any good for what Bruce had in mind anyway -> (IIRC, he wanted to get useful per-database numbers from "du"). - -Imho, ability to put different tables/indices (of the same database) -to different tablespaces (disks) is much more useful then ability to -use du/ls for administration purposes -:) - -Also, I think that we *must* go away from OS' driven disk space -allocation anyway. Currently, the way we extend table files breaks WAL -rule (nothing must go to disk untill logged). + we have to move tuples -from end of file to top to shrink relation - not perfect way to reuse -empty space. +... +... +... - -Vadim - -From Inoue@tpf.co.jp Tue Jun 27 00:05:13 2000 -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA05264 - for ; Tue, 27 Jun 2000 00:05:11 -0400 (EDT) -Received: from tpf.co.jp ([126.0.1.56] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP - id NAA01123; Tue, 27 Jun 2000 13:04:26 +0900 -Message-ID: <39582880.7565547@tpf.co.jp> -Date: Tue, 27 Jun 2000 13:07:28 +0900 -From: Hiroshi Inoue -X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U) -X-Accept-Language: ja -MIME-Version: 1.0 -To: Tom Lane -CC: "Mikheev, Vadim" , - Thomas Lockhart , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> <19576.962059702@sss.pgh.pa.us> -Content-Type: text/plain; charset=iso-2022-jp -Content-Transfer-Encoding: 7bit -Status: ROr - -Tom Lane wrote: - -> -> The symlinks wouldn't do any good for what Bruce had in mind anyway -> (IIRC, he wanted to get useful per-database numbers from "du"). - -Our database design seems to be in the opposite direction -if it is restricted for the convenience of command calls. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - - -From pgsql-hackers-owner+M3892@hub.org Tue Jun 27 00:14:24 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA05478 - for ; Tue, 27 Jun 2000 00:14:23 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5R46J182392; - Tue, 27 Jun 2000 00:06:20 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5R466180629 - for ; Tue, 27 Jun 2000 00:06:06 -0400 (EDT) -Received: from tpf.co.jp ([126.0.1.56] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP - id NAA01123; Tue, 27 Jun 2000 13:04:26 +0900 -Message-ID: <39582880.7565547@tpf.co.jp> -Date: Tue, 27 Jun 2000 13:07:28 +0900 -From: Hiroshi Inoue -X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U) -X-Accept-Language: ja -MIME-Version: 1.0 -To: Tom Lane -CC: "Mikheev, Vadim" , - Thomas Lockhart , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> <19576.962059702@sss.pgh.pa.us> -Content-Type: text/plain; charset=iso-2022-jp -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Tom Lane wrote: - -> -> The symlinks wouldn't do any good for what Bruce had in mind anyway -> (IIRC, he wanted to get useful per-database numbers from "du"). - -Our database design seems to be in the opposite direction -if it is restricted for the convenience of command calls. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - - -From pgsql-hackers-owner+M3905@hub.org Tue Jun 27 10:07:49 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21305 - for ; Tue, 27 Jun 2000 10:07:48 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5RDUh185923; - Tue, 27 Jun 2000 09:30:43 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5RDTB183147 - for ; Tue, 27 Jun 2000 09:29:12 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id PAA41830; - Tue, 27 Jun 2000 15:27:07 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Tue, 27 Jun 2000 15:27:06 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" , - "Mikheev, Vadim" - -Cc: "'Hiroshi Inoue'" , - Thomas Lockhart - , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Tue, 27 Jun 2000 15:27:03 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - - -> That'd work fine for me, but I think Bruce was arguing for paths that -> included the database name. We'd end up with paths that go something -> like -> ..../data/tablespaces/TABLESPACEOID/RELATIONOID -> (plus some kind of decoration for segment and version), so you'd have -> a hard time telling which files in a tablespace belong to which -> database. - -Well ,as long as we have the file per object layout it probably makes sense -to -have "speaking paths", But I see no real problem with: - -..../data/tablespacename/dbname/RELATIONOID[.dat|.idx] - -RELATIONOID standing for whatever the consensus will be. -I do not really see an argument for using a tablespaceoid instead of -it's [maybe mangled] name. - -Andreas - -From pgsql-hackers-owner+M3912@hub.org Tue Jun 27 10:28:39 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21468 - for ; Tue, 27 Jun 2000 10:28:38 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5REOa111784; - Tue, 27 Jun 2000 10:24:36 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5REOG109445 - for ; Tue, 27 Jun 2000 10:24:16 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA09575; - Tue, 27 Jun 2000 10:23:48 -0400 (EDT) -To: Zeugswetter Andreas SB -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: AW: [HACKERS] Big 7.1 open items -In-reply-to: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at> -References: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at> -Comments: In-reply-to Zeugswetter Andreas SB - message dated "Tue, 27 Jun 2000 15:27:03 +0200" -Date: Tue, 27 Jun 2000 10:23:48 -0400 -Message-ID: <9572.962115828@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Zeugswetter Andreas SB writes: -> I do not really see an argument for using a tablespaceoid instead of -> it's [maybe mangled] name. - -Eliminating filesystem-based restrictions on names, for one. -For example we'd not have to forbid slashes and (probably) backquotes -in tablespace names if we did this, and we'd not have to worry about -filesystem-induced limits on name lengths. Renaming a tablespace -would also be trivial instead of nigh impossible. - -It might be that using tablespace names as directory names is worth -enough from the admin point of view to make the above restrictions -acceptable. But it's a tradeoff, and not one with an obvious choice -IMHO. - - regards, tom lane - -From vmikheev@SECTORBASE.COM Tue Jun 27 14:01:08 2000 -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA28715 - for ; Tue, 27 Jun 2000 14:01:07 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Tue, 27 Jun 2000 10:53:03 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C39@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Bruce Momjian'" , - Hiroshi Inoue - -Cc: Tom Lane , - Thomas Lockhart - , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development - , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Tue, 27 Jun 2000 10:54:55 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: ROr - -> > > The symlinks wouldn't do any good for what Bruce had in -> > > mind anyway (IIRC, he wanted to get useful per-database -> > > numbers from "du"). -> > -> > Our database design seems to be in the opposite direction -> > if it is restricted for the convenience of command calls. -> -> Well, I don't see any reason not to use tablespace/database -> rather than just tablespace. Seems having fewer files in each directory - -Once again - ability to use different tablespaces (disks) for tables/indices -in the same schema. Schemas must not dictate where to store objects <- -bad design. - -> will be a little faster, and if we can make administration easier, -> why not? - -Because you'll not be able use du/ls once we'll implement new smgr anyway. - -And, btw, - for what are we going implement tablespaces? Just to have -fewer files in each dir ?! 
- -Vadim - -From pgsql-hackers-owner+M3925@hub.org Tue Jun 27 14:03:35 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA28748 - for ; Tue, 27 Jun 2000 14:03:34 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5RI1h139788; - Tue, 27 Jun 2000 14:01:44 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5RI1I138791 - for ; Tue, 27 Jun 2000 14:01:18 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:59174 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Tue, 27 Jun 2000 20:00:50 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 136zlm-0003zn-00; Tue, 27 Jun 2000 20:07:34 +0200 -Date: Tue, 27 Jun 2000 20:07:34 +0200 (CEST) -From: Peter Eisentraut -To: "Mikheev, Vadim" -cc: "'Hiroshi Inoue'" , "'Tom Lane'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C35@SECTORBASE1> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Mikheev, Vadim writes: - -> Do we need *both* database & tablespace to find table file ?! -> Imho, database shouldn't be used... - -Then the system tables from different databases would collide. 
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From vmikheev@SECTORBASE.COM Tue Jun 27 15:28:25 2000 -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA04820 - for ; Tue, 27 Jun 2000 15:28:24 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Tue, 27 Jun 2000 12:20:20 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3A@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Bruce Momjian'" -Cc: Hiroshi Inoue , Tom Lane , - Thomas Lockhart , - Peter Eisentraut - , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Tue, 27 Jun 2000 12:22:13 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: ROr - -> > > Well, I don't see any reason not to use tablespace/database -> > > rather than just tablespace. Seems having fewer files in -> > > each directory -> > -> > Once again - ability to use different tablespaces (disks) -> > for tables/indices in the same schema. Schemas must not dictate -> > where to store objects <- bad design. -> -> I am suggesting this symlink: -> -> ln -s data/base/testdb/myspace /var/myspace/testdb -> -> rather than: -> -> ln -s data/base/testdb/myspace /var/myspace -> -> Tablespaces still sit inside database directories, it is just that it -> points to a subdirectory of myspace, rather than myspace itself. -^^^^^^^^^^^ - -Didn't you mean - -ln -s /var/myspace/testdb data/base/testdb/myspace - -? - -I thought that you don't like symlinks from data/base/... This is -how I understood Tom' words: - -> The symlinks wouldn't do any good for what Bruce had in mind anyway -> (IIRC, he wanted to get useful per-database numbers from "du"). 
- -Vadim - -From vmikheev@SECTORBASE.COM Tue Jun 27 15:43:31 2000 -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA05148 - for ; Tue, 27 Jun 2000 15:43:30 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Tue, 27 Jun 2000 12:35:41 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3C@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Bruce Momjian'" -Cc: "'Peter Eisentraut'" , - "'Hiroshi Inoue'" - , - "'Tom Lane'" , - Thomas Lockhart - , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Tue, 27 Jun 2000 12:37:34 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: ROr - -> > > Then the system tables from different databases would collide. -> > -> > Actually, if we're going to use unique-ids for file names -> > then we have to know how to get system file names anyway. -> > Hm, OID+VERSION would make our life easier... Hiroshi? -> -> I assume we were going to have a pg_class.relversion to do that, but - ^^^^^^^^ -PG_CLASS_OID.VERSION_ID... - -Just a clarification -:) - -> that is per-database because pg_class is per-database. - -Vadim - -From vmikheev@SECTORBASE.COM Tue Jun 27 15:48:31 2000 -Received: from sectorbase2.sectorbase.com ([208.48.122.131]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA05452 - for ; Tue, 27 Jun 2000 15:48:30 -0400 (EDT) -Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21) - id ; Tue, 27 Jun 2000 12:40:42 -0700 -Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3D@SECTORBASE1> -From: "Mikheev, Vadim" -To: "'Bruce Momjian'" -Cc: "'Peter Eisentraut'" , - "'Hiroshi Inoue'" - , - "'Tom Lane'" , - Thomas Lockhart - , - Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: RE: [HACKERS] Big 7.1 open items -Date: Tue, 27 Jun 2000 12:42:35 -0700 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2650.21) -Content-Type: text/plain; - charset="iso-8859-1" -Status: ROr - -> I actually meant I thought we were going to have a pg_class column -> called relversion that held the currently active version for that -> relation. -> -> Yes, the file name will be pg_class_oid.version_id. -> -> Is that OK? - -We recently discussed pure *unique-id* file names... - -Vadim - - -From pgsql-hackers-owner+M3939@hub.org Tue Jun 27 17:03:33 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA08565 - for ; Tue, 27 Jun 2000 17:03:32 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5RL2B155891; - Tue, 27 Jun 2000 17:02:11 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5RL10155419 - for ; Tue, 27 Jun 2000 17:01:00 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11135; - Tue, 27 Jun 2000 17:00:12 -0400 (EDT) -To: Peter Eisentraut -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: -References: -Comments: In-reply-to Peter Eisentraut - message dated "Tue, 27 Jun 2000 20:07:34 +0200" -Date: Tue, 27 Jun 2000 17:00:11 -0400 -Message-ID: <11132.962139611@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Peter Eisentraut writes: -> Mikheev, Vadim writes: ->> Do we need *both* database & tablespace to find table file ?! ->> Imho, database shouldn't be used... - -> Then the system tables from different databases would collide. 
- -I've been assuming that we would create a separate tablespace for -each database, which would be the location of that database's -system tables. It's probably also the default tablespace for user -tables created in that database, though it wouldn't have to be. - -There should also be a known tablespace for the installation-wide tables -(pg_shadow et al). - -With this approach tablespace+relation would indeed be a sufficient -identifier. We could even eliminate the knowledge that certain -tables are installation-wide from the bufmgr and below (currently -that knowledge is hardwired in places that I'd rather didn't know -about it...) - - regards, tom lane - -From tgl@sss.pgh.pa.us Tue Jun 27 17:00:13 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA08435 - for ; Tue, 27 Jun 2000 17:00:12 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11135; - Tue, 27 Jun 2000 17:00:12 -0400 (EDT) -To: Peter Eisentraut -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: -References: -Comments: In-reply-to Peter Eisentraut - message dated "Tue, 27 Jun 2000 20:07:34 +0200" -Date: Tue, 27 Jun 2000 17:00:11 -0400 -Message-ID: <11132.962139611@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Peter Eisentraut writes: -> Mikheev, Vadim writes: ->> Do we need *both* database & tablespace to find table file ?! ->> Imho, database shouldn't be used... - -> Then the system tables from different databases would collide. - -I've been assuming that we would create a separate tablespace for -each database, which would be the location of that database's -system tables. It's probably also the default tablespace for user -tables created in that database, though it wouldn't have to be. 
- -There should also be a known tablespace for the installation-wide tables -(pg_shadow et al). - -With this approach tablespace+relation would indeed be a sufficient -identifier. We could even eliminate the knowledge that certain -tables are installation-wide from the bufmgr and below (currently -that knowledge is hardwired in places that I'd rather didn't know -about it...) - - regards, tom lane - -From tgl@sss.pgh.pa.us Tue Jun 27 17:18:49 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09638 - for ; Tue, 27 Jun 2000 17:18:48 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11377; - Tue, 27 Jun 2000 17:19:31 -0400 (EDT) -To: Bruce Momjian -cc: "Mikheev, Vadim" , - "'Peter Eisentraut'" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006271952.PAA05609@candle.pha.pa.us> -References: <200006271952.PAA05609@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Tue, 27 Jun 2000 15:52:40 -0400" -Date: Tue, 27 Jun 2000 17:19:31 -0400 -Message-ID: <11374.962140771@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> Well, that would allow us to mix database files in the same directory, -> if we wanted to do that. My opinion it is better to keep databases in -> separate directories in each tablespace for clarity and performance -> reasons. - -One reason not to do that is that we'd still have to special-case -the system-wide relations. If it's just tablespace and OID in the -path, then the system-wide rels look just the same as any other rel -as far as the low-level stuff is concerned. That would be nice. 
- -My feeling about the "clarity and performance" issue is that if a -dbadmin wants to keep track of database contents separately, he can -put different databases' tables into different tablespaces to start -with. If he puts several tables into one tablespace, he's saying -he doesn't care about distinguishing their space usage. There's -no reason for us to force an additional level of directory lookup -to be done whether the admin wants it or not. - - regards, tom lane - -From tgl@sss.pgh.pa.us Tue Jun 27 17:29:35 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09909 - for ; Tue, 27 Jun 2000 17:29:33 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA13026; - Tue, 27 Jun 2000 17:30:18 -0400 (EDT) -To: Bruce Momjian -cc: "Mikheev, Vadim" , - "'Peter Eisentraut'" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006272123.RAA09720@candle.pha.pa.us> -References: <200006272123.RAA09720@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Tue, 27 Jun 2000 17:23:49 -0400" -Date: Tue, 27 Jun 2000 17:30:17 -0400 -Message-ID: <13018.962141417@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Bruce Momjian writes: -> Yes, good point about pg_shadow. They don't have databases. How do we -> get multiple pg_class tables in the same directory? Is the -> pg_class.relversion file a number like 1,2,3,4, or does it come out of -> some global counter like oid. If so, we could put them in the same -> directory. - -I think we could get away with insisting that each database store its -pg_class and friends in a separate tablespace (physically distinct -directory) from any other database. That gets around the OID conflict. 
- -It's still an open question whether OID+version is better than -unique-ID for naming files that belong to different versions of the -same relation. I can see arguments on both sides. - - regards, tom lane - -From pgsql-hackers-owner+M3944@hub.org Tue Jun 27 17:33:05 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09986 - for ; Tue, 27 Jun 2000 17:33:04 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5RLV7124097; - Tue, 27 Jun 2000 17:31:07 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5RLUn123949 - for ; Tue, 27 Jun 2000 17:30:49 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA13026; - Tue, 27 Jun 2000 17:30:18 -0400 (EDT) -To: Bruce Momjian -cc: "Mikheev, Vadim" , - "'Peter Eisentraut'" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006272123.RAA09720@candle.pha.pa.us> -References: <200006272123.RAA09720@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Tue, 27 Jun 2000 17:23:49 -0400" -Date: Tue, 27 Jun 2000 17:30:17 -0400 -Message-ID: <13018.962141417@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Bruce Momjian writes: -> Yes, good point about pg_shadow. They don't have databases. How do we -> get multiple pg_class tables in the same directory? Is the -> pg_class.relversion file a number like 1,2,3,4, or does it come out of -> some global counter like oid. If so, we could put them in the same -> directory. 
- -I think we could get away with insisting that each database store its -pg_class and friends in a separate tablespace (physically distinct -directory) from any other database. That gets around the OID conflict. - -It's still an open question whether OID+version is better than -unique-ID for naming files that belong to different versions of the -same relation. I can see arguments on both sides. - - regards, tom lane - -From Inoue@tpf.co.jp Tue Jun 27 19:13:30 2000 -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA12791 - for ; Tue, 27 Jun 2000 19:13:28 -0400 (EDT) -Received: from tpf.co.jp ([126.0.1.56] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP - id IAA01830; Wed, 28 Jun 2000 08:13:26 +0900 -Message-ID: <395935CB.2CC10452@tpf.co.jp> -Date: Wed, 28 Jun 2000 08:16:27 +0900 -From: Hiroshi Inoue -X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U) -X-Accept-Language: ja -MIME-Version: 1.0 -To: Tom Lane -CC: Bruce Momjian , - "Mikheev, Vadim" , - "'Peter Eisentraut'" , - Thomas Lockhart , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <200006272123.RAA09720@candle.pha.pa.us> <13018.962141417@sss.pgh.pa.us> -Content-Type: text/plain; charset=iso-2022-jp -Content-Transfer-Encoding: 7bit -Status: RO - -Tom Lane wrote: - -> Bruce Momjian writes: -> > Yes, good point about pg_shadow. They don't have databases. How do we -> > get multiple pg_class tables in the same directory? Is the -> > pg_class.relversion file a number like 1,2,3,4, or does it come out of -> > some global counter like oid. If so, we could put them in the same -> > directory. -> -> I think we could get away with insisting that each database store its -> pg_class and friends in a separate tablespace (physically distinct -> directory) from any other database. That gets around the OID conflict. 
-> -> It's still an open question whether OID+version is better than -> unique-ID for naming files that belong to different versions of the -> same relation. I can see arguments on both sides. -> - -I don't stick to unique-ID. My main point has always been the -transactional control of file allocation change. -However *VERSION(_ID)* may be misleading because it couldn't -mean the version of pg_class tuples. - -Regards. - -Hiroshi Inoue -Inoue@tpf.co.jp - - - -From tgl@sss.pgh.pa.us Wed Jun 28 12:10:59 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA11316 - for ; Wed, 28 Jun 2000 12:10:58 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA15790; - Wed, 28 Jun 2000 12:11:40 -0400 (EDT) -To: Bruce Momjian -cc: "Mikheev, Vadim" , - "'Peter Eisentraut'" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: <200006281425.KAA05633@candle.pha.pa.us> -References: <200006281425.KAA05633@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Wed, 28 Jun 2000 10:25:21 -0400" -Date: Wed, 28 Jun 2000 12:11:40 -0400 -Message-ID: <15787.962208700@sss.pgh.pa.us> -From: Tom Lane -Status: ROr - -Bruce Momjian writes: -> If we put multiple database tables in the same directory, have we -> considered how to drop databases? Right now we do rm -rf: - -rm -rf will no longer work in a tablespaces environment anyway. -(Even if you kept symlinks underneath the DB directory, rm -rf -wouldn't follow them.) - -DROP DATABASE will have to be implemented honestly: run through -pg_class and do a regular DROP on each user table. - -Once you've got rid of the user tables, rm -rf should suffice to -get rid of the "home tablespace" as I've been calling it, with -all the system tables therein. 
- -Now that you mention it, this is another reason why system tables for -each database have to live in a separate tablespace directory: there's -no other good way to do that final stage of DROP DATABASE. The -DROP-each-table approach doesn't work for system tables (somewhere along -about the point where you drop pg_attribute, DROP TABLE itself would -stop working ;-)). - -However I do see a bit of a problem here: since DROP DATABASE is -ordinarily executed by a backend that's running in a different database, -how's it going to read pg_class of the target database? Perhaps it will -be necessary to fire up a sub-backend that runs in the target DB for -long enough to kill all the user tables. Looking messy... - - regards, tom lane - -From pgsql-hackers-owner+M3998@hub.org Wed Jun 28 19:53:28 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA27612 - for ; Wed, 28 Jun 2000 19:53:27 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5SNqG142069; - Wed, 28 Jun 2000 19:52:17 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5SNp7137729 - for ; Wed, 28 Jun 2000 19:51:07 -0400 (EDT) -Received: from tpf.co.jp ([126.0.1.56] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP - id IAA03041; Thu, 29 Jun 2000 08:50:01 +0900 -Message-ID: <395A8FDF.1132EC6D@tpf.co.jp> -Date: Thu, 29 Jun 2000 08:53:03 +0900 -From: Hiroshi Inoue -X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U) -X-Accept-Language: ja -MIME-Version: 1.0 -To: Tom Lane -CC: Bruce Momjian , - "Mikheev, Vadim" , - "'Peter Eisentraut'" , - Thomas Lockhart , - Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -References: <16404.962213972@sss.pgh.pa.us> -Content-Type: text/plain; charset=iso-2022-jp -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Tom Lane wrote: - -> "Hiroshi Inoue" writes: -> > Why do we have to have system tables per *database* ? -> > Is there anything wrong with global system tables ? -> > And how about adding dbid to pg_class,pg_proc etc ? -> -> We could, but I think I'd vote against it on two grounds: -> -> 1. Reliability. If something corrupts pg_class, do you want to -> lose your whole installation, or just one database? -> -> 2. Increased locking overhead/loss of concurrency. Currently, there -> is very little lock contention between backends running in different -> databases. A shared pg_class will be a single point of locking (as -> well as a single point of failure) for the whole installation. - -Isn't current design of PG's *database* for dropdb using "rm -rf" -rather than for above 1.2. ? -If we couldn't rely on our db itself and our locking mechanism is -poor,we could start different postmasters for different *database*s. - - -> It would solve the DROP DATABASE problem kind of nicely, but really -> it'd just be downgrading DROP DATABASE to a DROP SCHEMA operation... -> - -What is our *DATABASE* ? -Is it clear to all people ? -At least it's a vague concept for me. -Could you please tell me what kind of objects are our *DATABASE* -objects but could not be schema objects ? - -Regards. 
- -Hiroshi Inoue - - - -From pgsql-hackers-owner+M4003@hub.org Thu Jun 29 10:41:19 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA28321 - for ; Thu, 29 Jun 2000 10:39:57 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5T7nr158743; - Thu, 29 Jun 2000 03:49:53 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5T7io146030 - for ; Thu, 29 Jun 2000 03:44:51 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id JAA46266; - Thu, 29 Jun 2000 09:43:20 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Thu, 29 Jun 2000 09:43:20 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA59A8@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Bruce Momjian'" -Cc: "Mikheev, Vadim" , - Hiroshi Inoue - , Tom Lane , - Thomas Lockhart - , - Peter Eisentraut , Jan Wieck , - PostgreSQL-development - , - "Ross J. Reedstrom" -Subject: AW: AW: [HACKERS] Big 7.1 open items -Date: Thu, 29 Jun 2000 09:43:14 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="windows-1252" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - - -> > ln -s data/base/testdb/myspace/extent1 /var/myspace/extent1/testdb -> -> The idea was to put the main files in the directory, and create Extent2, -> Extent3 directories for the extents. - -The reasoning was, that the database subdir should be below the extentdir, -so that creating different fs for each extent would be easier, and not -depend -on the database name. 
- -It is easy to create fs for: - /var/myspace -or - /var/myspace[/extent1] - /var/myspace/extent2 -but not if it has dbname in it. - -Andreas - -From ZeugswetterA@wien.spardat.at Thu Jun 29 06:34:49 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA25201 - for ; Thu, 29 Jun 2000 06:34:44 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id GAA00379 for ; Thu, 29 Jun 2000 06:35:30 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id MAA33950; - Thu, 29 Jun 2000 12:33:42 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Thu, 29 Jun 2000 12:33:42 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA59AC@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" -Cc: "'Bruce Momjian'" , - Peter Eisentraut - , - "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart - , - Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: AW: AW: [HACKERS] Big 7.1 open items -Date: Thu, 29 Jun 2000 12:33:39 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -Status: RO - - -> > > I think I would prefer the ability to place more than one -> > database into -> > > the same tablespace. -> > -> > You can put user tables from multiple databases into the same -> > tablespace, under this proposal. Just not system tables. -> -> Yes, but then it is only half baked. - -Half baked or not, I think I am starting to like it. -I think I would restrict such an automagically created tablespace -(tblspace name = db name) to only contain tables from this database. 
- -Andreas - -From pgsql-hackers-owner+M4019@hub.org Thu Jun 29 13:24:36 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA08070 - for ; Thu, 29 Jun 2000 13:24:35 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5THLf102550; - Thu, 29 Jun 2000 13:21:41 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5THL1197262 - for ; Thu, 29 Jun 2000 13:21:01 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:50625 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Thu, 29 Jun 2000 19:20:28 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 137i5r-0000BK-00; Thu, 29 Jun 2000 19:27:15 +0200 -Date: Thu, 29 Jun 2000 19:27:15 +0200 (CEST) -From: Peter Eisentraut -To: Hiroshi Inoue -cc: Zeugswetter Andreas SB , - "'Mikheev, Vadim'" , - PostgreSQL-development -Subject: Re: AW: [HACKERS] Big 7.1 open items -In-Reply-To: <3959D7CF.E447565@tpf.co.jp> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Hiroshi Inoue writes: - -> According to your another posting,your *database* hierarchy is -> instance -> database -> schema -> object -> like Oracle. -> -> However SQL92 seems to have another hierarchy: -> cluster -> catalog -> schema -> object -> and dot notation catalog.schema.object could be used. - -FYI: - -An "instance" is a "cluster". I don't know where the word instance came -from, the docs sometimes call it "installation" or "site", which is even -worse. I have been using "database cluster" for the latest documentation -work. 
My dictionary defines a cluster as "a group of things gathered or -occurring closely together", which is what this is. Call it a "data area" -or an "initdb'ed thing", etc. - -A "catalog" can be equated with our "database". The method of creating -catalogs is implementation defined, so our CREATE DATABASE command is in -perfect compliance with the standard. We don't support the -catalog.schema.object notation but that notation only makes sense when you -can access more than one catalog at a time. We don't allow that and SQL -doesn't require it. We could allow that notation and throw an error when -the catalog name doesn't match the current database, but that's mere -cosmetic work. - -In entry level SQL 92, a "schema" is essentially the same as table -ownership. You can execute the command CREATE SCHEMA AUTHORIZATION -"peter", which means that user "peter" (where he came from is -"implementation-defined") can now create tables under his name. There is -no such thing as a table owner, there's the "containing schema" and its -owner. The tables "peter" creates can then be referenced by the dotted -notation. But it is not correct to equate this with CREATE USER. Even if -there was no schema for "peter" he could still connect and query other -people's tables. - -Moving beyond SQL 92 you can also create schemas with a different name -than your user name. This is merely a little more naming flexibility. 
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From peter@localhost.its.uu.se Thu Jun 29 19:25:40 2000 -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00202 - for ; Thu, 29 Jun 2000 19:25:39 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:52854 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Fri, 30 Jun 2000 01:25:27 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 137nnA-00023q-00; Fri, 30 Jun 2000 01:32:20 +0200 -Date: Fri, 30 Jun 2000 01:32:20 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <17726.962240702@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -Sender: Peter Eisentraut -Status: RO - -Tom Lane writes: - -> You can put *user* tables from more than one database into a table space. -> The restriction is just on *system* tables. - -I think my understanding as a user would be that a table space represents -a storage location. If I want to put a table/object/entire database on a -fancy disk somewhere I create a table space for it there. But if I want to -store all my stuff under /usr/local/pgsql/data then I wouldn't expect to -have to create more than one table space. So the table spaces become at -that point affected by the logical hierarchy: I must make sure to have -enough table spaces to have many databases. - -More specifically, what would the user interface to this look like? -Clearly there has to be some sort of CREATE TABLESPACE command. Now does -CREATE DATABASE imply a CREATE TABLESPACE? I think not. 
Do you have to -create a table space before creating each database? I think not. - -> We could avoid it along the lines you suggest (name table files like -> DBOID.RELOID.VERSION instead of just RELOID.VERSION) but is it really -> worth it? - -I only intended that for pg_class and other bootstrap-sort-of tables, -maybe all system tables. Normal heap files could look like RELOID.VERSION, -whereas system tables would look like "name.DBOID". Clearly there's no -market for renaming system tables or dropping any of their columns. We're -obviously going to have to treat pg_class special anyway. - -> Vadim's concerned about every byte that has to go into the WAL log, -> and I think he's got a good point. - -True. But if you only do it for the system tables then it might take less -space than keeping track of lots of table spaces that are unneeded. :-) - - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - - -From pgsql-hackers-owner+M4032@hub.org Thu Jun 29 20:12:39 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00852 - for ; Thu, 29 Jun 2000 20:12:38 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5TNwm184774; - Thu, 29 Jun 2000 19:58:48 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5TNvD180670 - for ; Thu, 29 Jun 2000 19:57:14 -0400 (EDT) -Received: from tpf.co.jp ([126.0.1.56] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP - id IAA04081; Fri, 30 Jun 2000 08:56:46 +0900 -Message-ID: <395BE2F5.687E90B0@tpf.co.jp> -Date: Fri, 30 Jun 2000 08:59:49 +0900 -From: Hiroshi Inoue -X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U) -X-Accept-Language: ja -MIME-Version: 1.0 -To: Peter Eisentraut -CC: Zeugswetter Andreas SB , - "'Mikheev, Vadim'" , - PostgreSQL-development -Subject: Re: AW: 
[HACKERS] Big 7.1 open items -References: -Content-Type: text/plain; charset=iso-2022-jp -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Peter Eisentraut wrote: - -> Hiroshi Inoue writes: -> -> > According to your another posting,your *database* hierarchy is -> > instance -> database -> schema -> object -> > like Oracle. -> > -> > However SQL92 seems to have another hierarchy: -> > cluster -> catalog -> schema -> object -> > and dot notation catalog.schema.object could be used. -> -> FYI: - -Thanks. -I'm asking to all what our *DATABASE* is. -Different from you,I couldn't see any decisive feature in our *DATABASE*. - -> -> -> An "instance" is a "cluster". I don't know where the word instance came - -I could find the word in Oracle. -IMHO,it corresponds to our initdb'ed thing(a postmaster controls). - -> -> from, the docs sometimes call it "installation" or "site", which is even -> worse. I have been using "database cluster" for the latest documentation -> work. My dictionary defines a cluster as "a group of things gathered or -> occurring closely together", which is what this is. Call it a "data area" -> or an "initdb'ed thing", etc. -> - -SQL92 seems to say that a cluster corresponds to a target of connection -and has no name(after connection was established). Isn't it same as our -*DATABASE* ? - -> -> A "catalog" can be equated with our "database". The method of creating -> catalogs is implementation defined, so our CREATE DATABASE command is in -> perfect compliance with the standard. We don't support the -> catalog.schema.object notation but that notation only makes sense when you -> can access more than one catalog at a time. - -Yes,it's most essential that we couldn't access more than one catalog. -This means that we have only one (noname) "catalog" per "cluster". - -> We don't allow that and SQL -> doesn't require it. 
We could allow that notation and throw an error when -> the catalog name doesn't match the current database, but that's mere -> cosmetic work. -> -> In entry level SQL 92, a "schema" is essentially the same as table -> ownership. You can execute the command CREATE SCHEMA AUTHORIZATION -> "peter", which means that user "peter" (where he came from is -> "implementation-defined") can now create tables under his name. There is -> no such thing as a table owner, there's the "containing schema" and its -> owner. The tables "peter" creates can then be referenced by the dotted -> notation. But it is not correct to equate this with CREATE USER. Even if -> there was no schema for "peter" he could still connect and query other -> people's tables. -> - -I've used *username* "schema"s in Oracle for a long time but I've never -thought that it's the essence of "schema". If I recoginze correctly,the -concept of "catalog" hasn't necessarily been important while "schema" -= "user". The conflict of "schema" name is equivalent to the conflict -of "user" name if "schema" = "user". IMHO,SQL92 has required the -concept of "catalog" because "schema" has been changed to be -independent of "user". - -Anyway in current PG "cluster":"catalog":"schema"=1:1:1(0) and -our *DATABASE* is an only confusing concept in the hierarchy.. - -Regards, - -Hiroshi Inoue - - - -From tgl@sss.pgh.pa.us Thu Jun 29 20:42:56 2000 -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00958 - for ; Thu, 29 Jun 2000 20:42:55 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id UAA02520; - Thu, 29 Jun 2000 20:43:32 -0400 (EDT) -To: Peter Eisentraut -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: -References: -Comments: In-reply-to Peter Eisentraut - message dated "Fri, 30 Jun 2000 01:32:20 +0200" -Date: Thu, 29 Jun 2000 20:43:32 -0400 -Message-ID: <2517.962325812@sss.pgh.pa.us> -From: Tom Lane -Status: RO - -Peter Eisentraut writes: -> Tom Lane writes: ->> You can put *user* tables from more than one database into a table space. ->> The restriction is just on *system* tables. - -> More specifically, what would the user interface to this look like? -> Clearly there has to be some sort of CREATE TABLESPACE command. Now does -> CREATE DATABASE imply a CREATE TABLESPACE? I think not. Do you have to -> create a table space before creating each database? I think not. - -I would say that CREATE DATABASE just implicitly creates a new -tablespace that's physically located right under the toplevel data -directory of the installation, no symlink. What's wrong with that? -You need not keep anything except the system tables of the DB there -if you don't want to. In practice, for someone who doesn't need to -worry about tablespaces (because they put the installation on a disk -with enough room for their purposes), the whole thing acts exactly -the same as it does now. - ->> We could avoid it along the lines you suggest (name table files like ->> DBOID.RELOID.VERSION instead of just RELOID.VERSION) but is it really ->> worth it? - -> I only intended that for pg_class and other bootstrap-sort-of tables, -> maybe all system tables. Normal heap files could look like RELOID.VERSION, -> whereas system tables would look like "name.DBOID". - -That would imply that the very bottom levels of the system know all -about which tables are system tables and which are not (and, if you -are really going to insist on the "name" part of that, that they -know what name goes with each system-table OID). I'd prefer to avoid -that. The less the smgr knows about the upper levels of the system, -the better. 
- -> Clearly there's no market for renaming system tables or dropping any -> of their columns. - -No, but there is a market for compacting indexes on system relations, -and I haven't heard a good proposal for doing index compaction in place. -So we need versioning for system indexes. - ->> Vadim's concerned about every byte that has to go into the WAL log, ->> and I think he's got a good point. - -> True. But if you only do it for the system tables then it might take less -> space than keeping track of lots of table spaces that are unneeded. :-) - -Again, WAL should not need to distinguish system and user tables. - -And as for the keeping track, the tablespace OID will simply replace the -database OID in the log and in the smgr interfaces. There's no "extra" -cost, except maybe by comparison to a system with neither tablespaces -nor multiple databases. - - regards, tom lane - -From peter@localhost.its.uu.se Sat Jul 1 10:39:11 2000 -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA02996 - for ; Sat, 1 Jul 2000 10:39:10 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:50862 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Sat, 1 Jul 2000 16:56:49 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 138Oo3-0003UQ-00; Sat, 01 Jul 2000 17:03:43 +0200 -Date: Sat, 1 Jul 2000 17:03:42 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <2517.962325812@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -Sender: Peter Eisentraut -Status: RO - -Tom Lane writes: - -> In practice, for someone who doesn't need to worry about tablespaces -> (because they put the installation on a disk with enough room for -> their purposes), the whole thing acts exactly the same as it does now. - -But I'd venture the guess that for someone who wants to use tablespaces it -wouldn't work as expected. Table spaces should represent a physical -storage location. Creation of table spaces should be a restricted -operation, possibly more than, but at least differently from, databases. -Eventually, table spaces probably will have attributes, such as -optimization parameters (random_page_cost). This will not work as expected -if you intermix them with the databases. - -I'd expect that if I have three disks and 50 databases, then I make three -tablespaces and assign the databases to them. I'll bet lunch that if we -don't do it that way that before long people will come along and ask for -something that does work this way. 
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From pgsql-hackers-owner+M4066@hub.org Sat Jul 1 13:21:39 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA03777 - for ; Sat, 1 Jul 2000 13:21:38 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e61He8S63312; - Sat, 1 Jul 2000 13:40:08 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e61Hd7S58820 - for ; Sat, 1 Jul 2000 13:39:07 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA22822; - Sat, 1 Jul 2000 13:37:21 -0400 (EDT) -To: Peter Eisentraut -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-reply-to: -References: -Comments: In-reply-to Peter Eisentraut - message dated "Sat, 01 Jul 2000 17:03:42 +0200" -Date: Sat, 01 Jul 2000 13:37:21 -0400 -Message-ID: <22819.962473041@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Peter Eisentraut writes: -> I'd expect that if I have three disks and 50 databases, then I make three -> tablespaces and assign the databases to them. - -In our last installment, you were complaining that you didn't want to -be bothered with that ;-) - -But I don't see any reason why CREATE DATABASE couldn't take optional -parameters indicating where to create the new DB's default tablespace. -We already have a LOCATION option for it that does something close to -that. 
- -Come to think of it, it would probably make sense to adapt the existing -notion of "location" (cf initlocation script) into something meaning -"directory that users are allowed to create tablespaces (including -databases) in". If there were an explicit table of allowed locations, -it could be used to address the protection issues you raise --- for -example, a location could be restricted so that only some users could -create tablespaces/databases in it. $PGDATA/data would be just the -first location in every installation. - - regards, tom lane - -From pgsql-hackers-owner+M4078@hub.org Sun Jul 2 11:16:52 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA14294 - for ; Sun, 2 Jul 2000 11:16:51 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e62FGqS51200; - Sun, 2 Jul 2000 11:16:52 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by hub.org (8.10.1/8.10.1) with ESMTP id e62FGaS50925 - for ; Sun, 2 Jul 2000 11:16:36 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:52424 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Sun, 2 Jul 2000 17:15:57 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 138lZz-0001VD-00; Sun, 02 Jul 2000 17:22:43 +0200 -Date: Sun, 2 Jul 2000 17:22:43 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: "Mikheev, Vadim" , - "'Hiroshi Inoue'" , - Thomas Lockhart , - Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: Re: [HACKERS] Big 7.1 open items -In-Reply-To: <22819.962473041@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Tom Lane writes: - -> Come to think of it, it would probably make sense to adapt the existing -> notion of "location" (cf initlocation script) into something meaning -> "directory that users are allowed to create tablespaces (including -> databases) in". - -This is what I've been trying to push all along. But note that this -mechanism does allow multiple databases per location. :) - - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From ZeugswetterA@wien.spardat.at Mon Jul 3 04:30:07 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA16088 - for ; Mon, 3 Jul 2000 04:30:05 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA19031 for ; Mon, 3 Jul 2000 04:30:07 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA28416; - Mon, 3 Jul 2000 10:28:06 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 3 Jul 2000 10:28:06 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA59B0@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Hiroshi Inoue'" , - Peter Eisentraut - , Tom Lane -Cc: Bruce Momjian , Jan Wieck , - PostgreSQL-development , - "Ross J. 
Reedstrom" -Subject: AW: [HACKERS] Big 7.1 open items -Date: Mon, 3 Jul 2000 10:28:05 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="windows-1252" -Status: RO - - -> > > > > In my mind the point of the "database" concept is to -> > > provide a domain -> > > > > within which custom datatypes and functions are available. -> > > > -> > > -> > > AFAIK few users understand it and many users have wondered -> > > why we couldn't issue cross "database" queries. -> > -> > Imho the same issue is access to tables on another machine. -> > If we "fix" that, access to another db on the same instance is just -> > a variant of the above. -> > -> -> What is a difference between SCHAMA and your "database" ? -> I myself am confused about them. - -"my *database*" corresponds to the current database, which is created with -"create database" in postgresql. It corresponds to the catalog concept in -SQL99. - -The schema is below the database. Access to different schemas with one -connection -is mandatory. Access to different catalogs (databases) with one connection -is not mandatory, -but should imho be solved analogous to access to another catalog on a -different -(SQL99) cluster. This would be a very nifty feature. 
- -Andreas - -From pgsql-hackers-owner+M3496@hub.org Fri Jun 16 15:55:14 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02116 - for ; Fri, 16 Jun 2000 14:55:13 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id NAA21581 for ; Fri, 16 Jun 2000 13:53:58 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5GHpqN06086; - Fri, 16 Jun 2000 13:51:52 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5GHpcN05946 - for ; Fri, 16 Jun 2000 13:51:39 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA07945 - for ; Fri, 16 Jun 2000 13:51:38 -0400 (EDT) -To: pgsql-hackers@postgresql.org -Subject: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename -Date: Fri, 16 Jun 2000 13:51:37 -0400 -Message-ID: <7942.961177897@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -After further thought I think there's a lot of merit in Hiroshi's -opinion that physical file names should not be tied to relation OID. -If we use a separately generated value for the file name, we can -solve a lot of problems pretty nicely by means of "table versioning". - -For example: VACUUM can't compact indexes at the moment, and what it -does do (scan the index and delete unused entries) is really slow. -The right thing to do is for it to generate an all-new index file, -but how do we do that without creating a risk of leaving the index -corrupted if we crash partway through? The answer is to build the -new index in a new physical file. But how do we install the new -file as the real index atomically, when it might span multiple -segments? 
If the physical file name is decoupled from the relation's -name *and* OID then there is no problem: the atomic event that makes -the new file(s) the real table contents is the commit of the new -pg_class row with the new value for the physical filename. - -Aside from possible improvements in VACUUM, this would let us do a -robust implementation of CLUSTER, and we could do the "really change -the table" variant of ALTER TABLE DROP COLUMN the same way if anyone -wants to do it. - -The only cost is that we need an additional column in pg_class to -hold the physical file name. That's not so bad, especially when -you remember that we'd surely need to add something to pg_class for -tablespace support anyway. - -If we bite that bullet, then we could also do something to satisfy -Bruce about having legible file names ;-). The column in pg_class -could perfectly well be a string, not a pure number, and that means -that we can throw in the relname (truncated to fit of course). So -the thing would act a lot like the original-relname-plus-OID variant -that's been discussed so far. (Original relname because ALTER TABLE -RENAME would *not* change the physical file name. But we could -think about a form of VACUUM that creates a whole new table by -versioning, and that would presumably bring the physical name back -in sync with the logical relname.) - -Here is a sketch of a concrete proposal. I see no need to have -separate pg_class columns for tablespace and physical relname; -instead, I suggest there be a column of type NAME that is the -file pathname (relative to the database directory). Further, -instead of the existing convention of appending .N to the base -file name to make extension segment names, I propose that we -always have a segment number in the physical file name, and that -the pg_class entry be required to contain a "%d" somewhere that -indicates where. 
The actual filename is manufactured by - sprintf(tempbuf, value_from_pg_class_column, segment_number); - -As an example, the arrangement I was suggesting earlier today -about segments in different subdirectories of a tablespace -could be implemented by assigning physical filenames like - - tablespace/%d/12345_relname - -where the 12345 is a value generated separately from the table's OID. -(We would still use the OID counter to produce these numbers, and -in fact there's no reason not to use the table's OID as the initial -unique ID for the physical filename. The point is just that the -physical filename doesn't have to remain forever equal to the -relation's OID.) - -If we use type NAME for this string then the tablespace part of the path -would have to be kept to no more than ~ 15 characters, but that seems -workable enough. (Anybody who really didn't like that could recompile -with larger NAMEDATALEN. Doesn't seem worth inventing a separate type.) - -As Hiroshi pointed out, one of the best aspects of this approach -is that the physical table layout policy doesn't have to be hard-wired -into low-level file access routines. The low-level routines don't -need to know much of anything about the format of the pathname, -they just stuff in the right segment number and use the name. The -layout policy need only be known to one single routine that generates -the strings that go into pg_class. So it'd be really easy to change. - -One thing we'd have to work out is that the critical system tables -(eg, pg_class itself, as well as its indexes) would have to have -predictable physical names. Otherwise there's no way for a new -backend to bootstrap itself up ... it can't very well read pg_class -to find out where pg_class is. A brute-force solution is to forbid -reversioning of the critical tables, but I suspect we can find a -less restrictive answer. - -This seems like it'd satisfy all the concerns that have been raised. -Comments? 
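
A minimal sketch of the "%d" template scheme described above, in C because the message itself uses sprintf. This is not PostgreSQL source. The helper name segment_path, its return convention, and the rule that the template must contain exactly one "%d" and no other "%" are assumptions made for illustration. The only thing taken from the message is that the pg_class string is used as a format and the segment number is substituted into it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL function): expand a path template
 * as it might be stored in a pg_class column into the file name of one
 * segment.  Returns 0 on success, -1 if the template is unusable or the
 * result does not fit in buf.
 *
 * Example: template "tablespace/%d/12345_relname" with segno 2 yields
 * "tablespace/2/12345_relname".
 */
static int
segment_path(char *buf, size_t buflen, const char *tmpl, int segno)
{
    const char *p = strstr(tmpl, "%d");
    int         n;

    /*
     * Assumed safety rule: the first '%' must be the one in "%d" and no
     * further '%' may follow, so a stored string can never act as an
     * arbitrary format.
     */
    if (p == NULL || strchr(tmpl, '%') != p || strchr(p + 1, '%') != NULL)
        return -1;

    n = snprintf(buf, buflen, tmpl, segno);
    if (n < 0 || (size_t) n >= buflen)
        return -1;              /* output error or truncated path */
    return 0;
}
```

The low-level caller only supplies a segment number. Where the segment lands (flat file, per-segment subdirectory, a different tablespace) is decided entirely by whoever generated the template string, which is the property the message argues for.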
- - regards, tom lane - -From pgsql-hackers-owner+M3524@hub.org Fri Jun 16 22:30:59 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07796 - for ; Fri, 16 Jun 2000 21:30:58 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA26393 for ; Fri, 16 Jun 2000 21:16:37 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5H1EeM94683; - Fri, 16 Jun 2000 21:14:40 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5H1D0M94365 - for ; Fri, 16 Jun 2000 21:13:00 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA10209; - Fri, 16 Jun 2000 21:12:30 -0400 (EDT) -To: Chris Bitmead -cc: pgsql-hackers@postgreSQL.org -Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename -In-reply-to: <394ACB42.C87C59B8@bitmead.com> -References: <7942.961177897@sss.pgh.pa.us> <394ACB42.C87C59B8@bitmead.com> -Comments: In-reply-to Chris Bitmead - message dated "Sat, 17 Jun 2000 10:50:10 +1000" -Date: Fri, 16 Jun 2000 21:12:29 -0400 -Message-ID: <10206.961204349@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Chris Bitmead writes: -> At least on UNIX, couldn't you use a hard-link and change the name in -> pg_class immediately? Let the brain-dead operating systems use the -> vacuum method. - -Hmm ... maybe, but it doesn't seem worth the portability headache to -me. We do have an NT port that we don't want to break, and I don't -think RENAME TABLE is worth the trouble of testing/supporting two -implementations. - -Even on Unix, aren't there filesystems that don't do hard links? 
-Not that I'd recommend running Postgres on such a volume, but... - - regards, tom lane - -From pgsql-hackers-owner+M3525@hub.org Sat Jun 17 07:01:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA22194 - for ; Sat, 17 Jun 2000 06:01:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id FAA21836 for ; Sat, 17 Jun 2000 05:39:21 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5H9bSM88777; - Sat, 17 Jun 2000 05:37:28 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5H9anM88603 - for ; Sat, 17 Jun 2000 05:36:49 -0400 (EDT) -Received: from mcadnote1 (ppm130.noc.fukui.nsk.ne.jp [210.161.188.49]) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id SAA08384; Sat, 17 Jun 2000 18:36:00 +0900 -From: "Hiroshi Inoue" -To: "Tom Lane" -Cc: -Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename -Date: Sat, 17 Jun 2000 18:38:53 +0900 -Message-ID: -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-2022-jp" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) -In-Reply-To: <7942.961177897@sss.pgh.pa.us> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 -Importance: Normal -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On -> Behalf Of Tom Lane -> -> After further thought I think there's a lot of merit in Hiroshi's -> opinion that physical file names should not be tied to relation OID. 
-> If we use a separately generated value for the file name, we can -> solve a lot of problems pretty nicely by means of "table versioning". -> -> For example: VACUUM can't compact indexes at the moment, and what it -> does do (scan the index and delete unused entries) is really slow. -> The right thing to do is for it to generate an all-new index file, -> but how do we do that without creating a risk of leaving the index -> corrupted if we crash partway through? The answer is to build the -> new index in a new physical file. But how do we install the new -> file as the real index atomically, when it might span multiple -> segments? If the physical file name is decoupled from the relation's -> name *and* OID then there is no problem: the atomic event that makes -> the new file(s) the real table contents is the commit of the new -> pg_class row with the new value for the physical filename. -> -> Aside from possible improvements in VACUUM, this would let us do a -> robust implementation of CLUSTER, and we could do the "really change -> the table" variant of ALTER TABLE DROP COLUMN the same way if anyone -> wants to do it. -> - -Yes,I've wondered how do we implement column_is_really_dropped -ALTER TABLE DROP COLUMN feature without this kind of mechanism. - -> The only cost is that we need an additional column in pg_class to -> hold the physical file name. That's not so bad, especially when -> you remember that we'd surely need to add something to pg_class for -> tablespace support anyway. -> -> If we bite that bullet, then we could also do something to satisfy -> Bruce about having legible file names ;-). The column in pg_class -> could perfectly well be a string, not a pure number, and that means -> that we can throw in the relname (truncated to fit of course). So -> the thing would act a lot like the original-relname-plus-OID variant -> that's been discussed so far. (Original relname because ALTER TABLE -> RENAME would *not* change the physical file name. 
But we could -> think about a form of VACUUM that creates a whole new table by -> versioning, and that would presumably bring the physical name back -> in sync with the logical relname.) -> -> As Hiroshi pointed out, one of the best aspects of this approach -> is that the physical table layout policy doesn't have to be hard-wired -> into low-level file access routines. The low-level routines don't -> need to know much of anything about the format of the pathname, -> they just stuff in the right segment number and use the name. The -> layout policy need only be known to one single routine that generates -> the strings that go into pg_class. So it'd be really easy to change. -> - -Ross's approach is fundamentally same though he is using relname+OID -naming rule. I've said his trial is most practical one. - -> One thing we'd have to work out is that the critical system tables -> (eg, pg_class itself, as well as its indexes) would have to have -> predictable physical names. - -The only limitation of the relation filename is the uniqueness. -So it doesn't introduce any inconsistency that system tables -have fixed name. -As for system relations it wouldn't be so bad because CLUSTER/ -ALTER TABLE DROP COLUMN ... would be unnecessary(maybe). -But as for system indexes,it is preferable that VACUUM/REINDEX -could rebuild them safely. System indexes never shrink currently. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - -From pgsql-hackers-owner+M3529@hub.org Sat Jun 17 10:01:24 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA24004 - for ; Sat, 17 Jun 2000 09:01:23 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA28633 for ; Sat, 17 Jun 2000 08:57:47 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5HCtxM77095; - Sat, 17 Jun 2000 08:55:59 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5HCtoM77026 - for ; Sat, 17 Jun 2000 08:55:50 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:57716 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Sat, 17 Jun 2000 14:55:25 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 133IET-0002Y3-00; Sat, 17 Jun 2000 15:01:53 +0200 -Date: Sat, 17 Jun 2000 15:01:53 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: pgsql-hackers@postgreSQL.org -Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated - filename -In-Reply-To: <7942.961177897@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Tom Lane writes: - -> tablespace/%d/12345_relname - -Throwing table spaces and relation names into one pot doesn't excite me -very much. For example, before long people will want to - -* Query what tables are in what space (without using string operations) -Consider for example creating a new table and choosing where to put it. - -* Rename table spaces - -* Assign attributes of some sort to table spaces (permissions, etc.) 
- -* Use table space names with more than 15 characters. :) - -Somehow table spaces need to be catalogued. You could still make the -physical file name 'tablespaceoid/rest' without actually having to look up -anything, although that depends on your symlink idea which is still under -discussion. - -Then, why are all nth segments of tables in one directory in that -proposal? - -Also, you said before that an old relname (after rename) is worse than -none at all. I couldn't agree more. - -Why not use OID.[SEGMENT.]VERSION for the physical relname (different -order possible)? That way you at least have some guaranteed correspondence -between files and tables. Version could probably be an INT2, so you save -some space. - - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From pgsql-hackers-owner+M3534@hub.org Sat Jun 17 13:31:11 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA02801 - for ; Sat, 17 Jun 2000 12:31:10 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA07848 for ; Sat, 17 Jun 2000 12:27:14 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5HGPJM95074; - Sat, 17 Jun 2000 12:25:19 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5HGP1M94990 - for ; Sat, 17 Jun 2000 12:25:01 -0400 (EDT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA18939; - Sat, 17 Jun 2000 12:24:56 -0400 (EDT) -To: "Hiroshi Inoue" -cc: pgsql-hackers@postgreSQL.org -Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename -In-reply-to: -References: -Comments: In-reply-to "Hiroshi Inoue" - message dated "Sat, 17 Jun 2000 18:38:53 +0900" -Date: Sat, 17 Jun 
2000 12:24:56 -0400 -Message-ID: <18936.961259096@sss.pgh.pa.us> -From: Tom Lane -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -"Hiroshi Inoue" writes: ->> One thing we'd have to work out is that the critical system tables ->> (eg, pg_class itself, as well as its indexes) would have to have ->> predictable physical names. - -> The only limitation of the relation filename is the uniqueness. -> So it doesn't introduce any inconsistency that system tables -> have fixed name. -> As for system relations it wouldn't be so bad because CLUSTER/ -> ALTER TABLE DROP COLUMN ... would be unnecessary(maybe). -> But as for system indexes,it is preferable that VACUUM/REINDEX -> could rebuild them safely. System indexes never shrink currently. - -Right, it's the index-shrinking business that has me worried. -Most of the other reasons for swapping in a new file don't apply -to system tables, but that one does. - -One possibility is to say that system *tables* can't be reversioned -(at least not the critical ones) but system *indexes* can be. -Then we'd have to use your ignore-system-indexes stuff during backend -startup, until we'd found out where the indexes are. Might be too big -a time penalty however... not sure. Shared cache inval of a system -index could be a little tricky too; I don't think the catcache routines -are prepared to fall back to non-index scan are they? - -On the whole it might be better to cheat by using a side data structure -like the pg_internal.init file, that a backend could consult to find out -where the indexes are now. 
- - regards, tom lane - -From pgsql-hackers-owner+M3553@hub.org Sun Jun 18 18:31:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA08740 - for ; Sun, 18 Jun 2000 17:31:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id RAA18332 for ; Sun, 18 Jun 2000 17:21:51 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5ILJcM11720; - Sun, 18 Jun 2000 17:19:38 -0400 (EDT) -Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5ILILM09628 - for ; Sun, 18 Jun 2000 17:18:21 -0400 (EDT) -Received: from regulus.student.UU.SE ([130.238.5.2]:40239 "EHLO - regulus.its.uu.se") by merganser.its.uu.se with ESMTP - id ; Sun, 18 Jun 2000 23:17:49 +0200 -Received: from peter (helo=localhost) - by regulus.its.uu.se with local-esmtp (Exim 3.02 #2) - id 133mYM-0000Ns-00; Sun, 18 Jun 2000 23:24:26 +0200 -Date: Sun, 18 Jun 2000 23:24:26 +0200 (CEST) -From: Peter Eisentraut -To: Tom Lane -cc: PostgreSQL Development -Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated - filename -In-Reply-To: <19045.961260445@sss.pgh.pa.us> -Message-ID: -MIME-Version: 1.0 -Content-Type: TEXT/PLAIN; charset=ISO-8859-1 -Content-Transfer-Encoding: 8BIT -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -Tom Lane writes: - -> I don't think it's a good idea to have to consult pg_tablespace to find -> out where a table actually is --- I think the pathname (or smgr access -> token as Ross would call it ;-)) ought to be determinable from just the -> pg_class entry. - -That's why I suggested the table space oid. That would be readily -available from pg_class. 
- - -> Tablespaces can have logical names stored in pg_tablespace; they just -> can't contribute more than a dozen or so characters to file pathnames -> under the implementation I'm proposing. That doesn't seem too -> unreasonable; the pathname part can be some sort of abbreviated name. - -Since the abbreviated name is really only used internally it might as well -be the oid. Otherwise you create a weird functional dependency like the -pg_shadow.usesysid field that's just an extra layer of maintenance. - - -> this implementation mechanism will support either policy choice --- -> original relname in the filename, or just a numeric ID for the -> filename - -But when you look at a file name `12345_accounts_recei' you know neither - -* whether the table name was really `accounts_recei' or whether the name -was truncated - -* whether the table still has that name, whatever it was - -* what table this is at all - -So in the aggregate you really know less than nothing. :-) - - -> > Why not use OID.[SEGMENT.]VERSION for the physical relname (different -> > order possible)? -> -> Doesn't give you a manageable way to split segments across different -> disks. - -Okay, so maybe ${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION. - -This doesn't need any catalog lookup outside of pg_class, yet it's still -easy to resolve to human-readable names by simple admin tools (SELECT * -FROM pg_foo WHERE oid = xxx). VERSION would be unique within a conceptual -relation, so you could even see how many times the relation was altered in -major ways (kind of). 
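
For comparison, a minimal sketch of the ${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION layout suggested above. Again this is illustrative C, not PostgreSQL source. The function name relfile_path, the parameter names, and the use of unsigned int for every component (the message suggests VERSION could be an INT2) are assumptions.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a PostgreSQL function): build a relation file
 * path purely from numeric identifiers, so that no catalog other than
 * pg_class has to be consulted.  Returns 0 on success, -1 if the path
 * does not fit in buf.
 */
static int
relfile_path(char *buf, size_t buflen, const char *base,
             unsigned int tablespace_oid, unsigned int segno,
             unsigned int rel_oid, unsigned int version)
{
    int n = snprintf(buf, buflen, "%s/%u/%u/%u.%u",
                     base, tablespace_oid, segno, rel_oid, version);

    if (n < 0 || (size_t) n >= buflen)
        return -1;              /* output error or truncated path */
    return 0;
}
```

Unlike a stored template string, this fixes the layout policy in one function, but every path component can be mapped back to a catalog row by OID.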
- - --- -Peter Eisentraut Sernanders väg 10:115 -peter_e@gmx.net 75262 Uppsala -http://yi.org/peter-e/ Sweden - - -From pgsql-hackers-owner+M3561@hub.org Sun Jun 18 21:31:03 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA20523 - for ; Sun, 18 Jun 2000 20:31:02 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA25719 for ; Sun, 18 Jun 2000 20:26:49 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5J0OLM53050; - Sun, 18 Jun 2000 20:24:21 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5J0NmM50883 - for ; Sun, 18 Jun 2000 20:23:49 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id JAA09003; Mon, 19 Jun 2000 09:22:45 +0900 -From: "Hiroshi Inoue" -To: "Chris Bitmead" , "Tom Lane" -Cc: "Peter Eisentraut" , -Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename -Date: Mon, 19 Jun 2000 09:24:56 +0900 -Message-ID: <000901bfd984$cbf1dfc0$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="ISO-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: <394C20C6.9580A8A9@bitmead.com> -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Importance: Normal -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On -> Behalf Of Chris Bitmead -> -> Tom Lane wrote: -> -> > > Also, you said before that an old relname (after rename) is worse than -> > > none at all. I couldn't agree more. 
-> > -> > I'm not the one who wants relnames in the physical names ;-). However, -> > this implementation mechanism will support either policy choice --- -> > original relname in the filename, or just a numeric ID for the filename -> > --- and that seems like a good sign to me. -> > -> > > Why not use OID.[SEGMENT.]VERSION for the physical relname (different -> > > order possible)? -> -> Unless VERSION is globally unique like an oid is, having RELNAME.VERSION -> would be a problem if you created a table with the same name as a -> recently renamed table. -> - -In my proposal(relname+unique-id),the unique-id is globally unique -and relname is only for dba's convenience. I've said many times that -we should be free from the rule of file naming as far as possible. -I myself don't mind the name of relation files except that they should -be globally unique. I had to propose my opinion for file naming -because people have been so enthusiastic about globally_not_unique -file naming. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - - -From pgsql-hackers-owner+M3523@hub.org Fri Jun 16 22:01:00 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07568 - for ; Fri, 16 Jun 2000 21:00:59 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA25354 for ; Fri, 16 Jun 2000 20:54:02 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5H0q3M53458; - Fri, 16 Jun 2000 20:52:03 -0400 (EDT) -Received: from tech.com.au (IDENT:root@techpt.lnk.telstra.net [139.130.75.122]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5H0oRM47761 - for ; Fri, 16 Jun 2000 20:50:28 -0400 (EDT) -Received: from bitmead.com (IDENT:chris@tardis [203.41.180.243]) - by tech.com.au (8.9.3/8.9.3) with ESMTP id KAA21482; - Sat, 17 Jun 2000 10:50:14 +1000 -Message-ID: <394ACB42.C87C59B8@bitmead.com> -Date: Sat, 17 Jun 2000 10:50:10 +1000 -From: Chris Bitmead -X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Tom Lane -CC: pgsql-hackers@postgreSQL.org -Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated - filename -References: <7942.961177897@sss.pgh.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -Tom Lane wrote: - So -> the thing would act a lot like the original-relname-plus-OID variant -> that's been discussed so far. (Original relname because ALTER TABLE -> RENAME would *not* change the physical file name. But we could -> think about a form of VACUUM that creates a whole new table by -> versioning, and that would presumably bring the physical name back -> in sync with the logical relname.) 
- -At least on UNIX, couldn't you use a hard-link and change the name in -pg_class immediately? Let the brain-dead operating systems use the -vacuum method. - -From pgsql-hackers-owner+M3576@hub.org Mon Jun 19 01:58:35 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA00789 - for ; Mon, 19 Jun 2000 00:58:34 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5J4qfM87650; - Mon, 19 Jun 2000 00:52:41 -0400 (EDT) -Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5J4oUM77400 - for ; Mon, 19 Jun 2000 00:50:30 -0400 (EDT) -Received: from cadzone ([126.0.1.40] (may be forged)) - by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP - id NAA09265; Mon, 19 Jun 2000 13:50:22 +0900 -From: "Hiroshi Inoue" -To: "Peter Eisentraut" -Cc: "PostgreSQL Development" , - "Tom Lane" -Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generatedfilename -Date: Mon, 19 Jun 2000 13:52:34 +0900 -Message-ID: <001201bfd9aa$2f1c1320$2801007e@tpf.co.jp> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="ISO-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 -In-Reply-To: -X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 -Importance: Normal -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - -> -----Original Message----- -> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On -> Behalf Of Peter Eisentraut -> -> Tom Lane writes: -> -> > I don't think it's a good idea to have to consult pg_tablespace to find -> > out where a table actually is --- I think the pathname (or smgr access -> > token as Ross would call it ;-)) ought to be determinable from just the -> > pg_class entry. 
-> -> That's why I suggested the table space oid. That would be readily -> available from pg_class. -> - -It seems to me that the following 1)2) has always been mixed up. -IMHO,they should be distinguished clearly. - -1) Where the table is stored - Currently PostgreSQL relies on relname -> filename mapping - rule to access *existent* relations and doesn't have this - information in its database. Our(Tom,Ross,me) proposal is to - keep the information(token) in pg_class and provide a standard - transactional control mechanism for the change of table file - allocation. By doing it we would be able to be free from table - allocation(naming) rule. - Isn't it a kind of thing why we haven't had it from the first ? - -2) Where to store the table - Yes,TABLE(DATA)SPACE should encapsulate this concept. - -I want the decision about 1) first. Ross has already tried it without -2). - -Comments ? - -As for 2) every one seems to have each opinion and the discussion -has always been divergent. Please don't discard 1) together. - -Regards. 
- -Hiroshi Inoue -Inoue@tpf.co.jp - - -From pgsql-hackers-owner+M3591@hub.org Mon Jun 19 11:01:19 2000 -Received: from renoir.op.net (root@renoir.op.net [207.29.195.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21409 - for ; Mon, 19 Jun 2000 10:01:18 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA05383 for ; Mon, 19 Jun 2000 09:56:59 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e5JDsVM91574; - Mon, 19 Jun 2000 09:54:31 -0400 (EDT) -Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) - by hub.org (8.10.1/8.10.1) with ESMTP id e5JDldM77267 - for ; Mon, 19 Jun 2000 09:48:05 -0400 (EDT) -Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16]) - by gandalf.it-austria.net (xxx/xxx) with ESMTP id PAA80686; - Mon, 19 Jun 2000 15:46:24 +0200 -Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0) - id ; Mon, 19 Jun 2000 15:46:24 +0200 -Message-ID: <219F68D65015D011A8E000006F8590C605BA5978@sdexcsrv1.f000.d0188.sd.spardat.at> -From: Zeugswetter Andreas SB -To: "'Tom Lane'" , Peter Eisentraut -Cc: pgsql-hackers@postgresql.org -Subject: AW: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated - filename -Date: Mon, 19 Jun 2000 15:46:22 +0200 -MIME-Version: 1.0 -X-Mailer: Internet Mail Service (5.5.2448.0) -Content-Type: text/plain; - charset="iso-8859-1" -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: RO - - -> It's better than *all* segments of tables in one directory, which is -> what you get if the segment number is just a component of a flat file -> name. We have to have a better answer than that for people who need -> to cope with tables bigger than a disk. 
Perhaps someone can -> think of a -> better answer than subdirectory-per-segment-number, but I think that -> will work well enough; and it doesn't add any complexity for file -> access. - -I do not see this connection between a filesystem and a disk ? -Modern systems have the ability to join more than one disk into -one filesystem. - -Also if we think about separating large tables into smaller parts -we imho want something where the optimizer has knowledge -what data it finds in what part of the table. - -Andreas - -From pgsql-hackers-owner+M4680@hub.org Mon Jul 10 11:16:07 2000 -Received: from hub.org (root@hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA28153 - for ; Mon, 10 Jul 2000 10:16:06 -0400 (EDT) -Received: from hub.org (majordom@localhost [127.0.0.1]) - by hub.org (8.10.1/8.10.1) with SMTP id e6AEG5W83419; - Mon, 10 Jul 2000 10:16:05 -0400 (EDT) -Received: from corvette.mascari.com (dhcp160176144.columbus.rr.com [24.160.176.144]) - by hub.org (8.10.1/8.10.1) with ESMTP id e6AE7FW63372 - for ; Mon, 10 Jul 2000 10:07:24 -0400 (EDT) -Received: from mascari.com (ferrari.mascari.com [192.168.2.1]) - by corvette.mascari.com (8.9.3/8.9.3) with ESMTP id KAA10768; - Mon, 10 Jul 2000 10:03:27 -0400 -Message-ID: <3969D7CA.8AF9573C@mascari.com> -Date: Mon, 10 Jul 2000 10:03:54 -0400 -From: Mike Mascari -Organization: Mascari Development Inc -X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.5-15 i586) -X-Accept-Language: en -MIME-Version: 1.0 -To: Bruce Momjian -CC: Tom Lane , Philip Warner , - Chris Bitmead , - "pgsql-hackers@postgreSQL.org" -Subject: Re: [HACKERS] Re: [GENERAL] PostgreSQL vs. 
MySQL -References: <200007101310.JAA26260@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -X-Mailing-List: pgsql-hackers@postgresql.org -Precedence: bulk -Sender: pgsql-hackers-owner@hub.org -Status: ROr - -Bruce Momjian wrote: -> -> > And of course the major problem with *that* is how do you get the -> > connection request to arrive at a backend that's been prestarted in -> > the right database? If you don't commit to a database then there's -> > not a whole lot of prestarting that can be done. -> > -> > It occurs to me that this'd get a whole lot more feasible if one -> > postmaster == one database, which is something we *could* do if we -> > implemented schemas. Hiroshi's been arguing that the current hard -> > separation between databases in an installation should be done away -> > with in favor of schemas, and I'm starting to see his point... -> -> This is interesting. You believe schema's would allow a pool of -> backends to connect to any database? That would clearly be a win. - -I'm just curious, but did a consensus ever develop on schemas? It -seemed that the schemas/tablespace thread just ran out of steam. -For what its worth, I like the idea of: - -1. PostgreSQL installation -> SQL cluster of catalogs -2. PostgreSQL database -> SQL catalog -3. PostgreSQL schema -> SQL schema - -This correlates nicely with the current representation of -DATABASE. People can run multiple SQL clusters by running -multiple postmasters on different ports. Today, most people -achieve a logical separation of data by issuing multiple CREATE -DATABASE commands. But under the above, most sites would run with -a single PostgreSQL database (SQL catalog), since: - -"Catalogs are named collections of schemas in an SQL-environment" - -This would mirror the behavior of Oracle, where most people run -with a single Oracle SID. The logical separation would be -achieved with SCHEMA's a level under the current DATABASE (a.k.a. -catalog). 
This eliminates the problem of using softlinks and -creating various subdirectories to mirror *logical* partitioning -of data. It also alleviates the problem people currently -encounter when they've built their data model around multiple -DATABASE's but learn later that they need access to more than one -simultaneously. Instead, they'll model their design around -multiple SCHEMA's which exist within a single DATABASE instance. - -It seems that the discussion of tablespaces shouldn't be mixed -with SCHEMA's except to note that a DATABASE (catalog) should -have a default TABLESPACE whose path matches the current one: - -../pgsql/data/base/ - -Later, users might be able to create a hierarchy of default -TABLESPACE's where the location of the object is found with logic -like: - -1. Is there an object-specified tablespace? - (ex: CREATE TABLE payroll IN TABLESPACE...) -2. Is there a user-specified default tablespace? - (ex: CREATE USER mike DEFAULT TABLESPACE...) -3. Is there a schema-specified default tablespace? - (ex: CREATE SCHEMA accounting DEFAULT TABLESPACE...) -4. Use the catalog-default tablespace - (ex: CREATE DATABASE postgres DEFAULT LOCATION '/home/pgsql') - -with the last example creating the system tablespace, -'system_tablespace', with '/home/pgsql' as the location. - -Anyways, it seems a consensus should be developed on the whole -Cluster/Catalog/Schema scenario. - -Mike Mascari - -From Albert.Langer@Directory-Designs.org Sun Apr 15 12:57:07 2001 -Received: from relay1.pair.com (relay1.pair.com [209.68.1.20]) - by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA22644 - for ; Sun, 15 Apr 2001 12:57:06 -0400 (EDT) -Received: (qmail 16730 invoked from network); 15 Apr 2001 16:56:26 -0000 -Received: from cpe-144-132-70-18.vic.bigpond.net.au (HELO w98) (144.132.70.18) - by relay1.pair.com with SMTP; 15 Apr 2001 16:56:26 -0000 -X-pair-Authenticated: 144.132.70.18 -Reply-To: -From: "Albert Langer" -To: "'Bruce Momjian'" , - "'Hiroshi Inoue'" , - "'Ross J. 
Reedstrom'" , - "'Mike Mascari'" , , - "'Tom Lane'" , - "'Zeugswetter Andreas SB'" , - "'The Hermit Hacker'" , - "'Oliver Elphick'" , - "'Don Baccus'" , - "'Thomas Lockhart'" , - "'Chris Bitmead'" , - "'Philip J. Warner'" , - "'Peter Eisentraut'" , - "'Lamar Owen'" , - "'Vadim Mikheev'" -Subject: Tablespaces - checkout SAP DB -Date: Mon, 16 Apr 2001 02:56:04 +1000 -Message-ID: <000001c0c5cc$f5fd6ac0$6628a8c0@nowhere.com> -MIME-Version: 1.0 -Content-Type: text/plain; - charset="iso-8859-1" -Content-Transfer-Encoding: 7bit -X-Priority: 3 (Normal) -X-MSMail-Priority: Normal -X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2910.0) -X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200 -Importance: Normal -Status: RO - -Hi everyone, - -Sorry about the long To list - this is to everyone I noticed commenting in: -http://www.postgresql.org/docs/pgsql/doc/TODO.detail/tablespaces - -I strongly recommend checkout of approach used in SAP DB: - -http://www.sap.com/solutions/technology/sapdb/sap_db_documentation.htm - -Their glossy 2 page brochure emphasizes the way they handle -tablespaces as strongest point for ease of administration: - -http://www.sap.com/solutions/technology/sapdb/pdf/50033321.pdf - -Directory distribution explained in: -http://www.sap.com/solutions/technology/sapdb/pdf/directorydistrib_72eng.pdf - -Architecture and tablespace/devspace concepts explained in: - -http://www.sap.com/solutions/technology/sapdb/pdf/dbmgui_73eng.pdf -(721K) - -A good short overview can be obtained from the Glossary: - -http://www.sap.com/solutions/technology/sapdb/sap_db_glossary.htm -(not .pdf - ordinary html) - -vvvvvvv -data devspace - -The user data (tables, indexes) and the SQL catalog are stored in the data -devspaces. A table or an index needs one page (minimum); a table can use all -the data devspaces that is the whole database (maximum). A table increases -or decreases in size automatically without administrative intervention. 
- -As a rule, a database internal striping algorithm distributes the data -belonging to a table evenly across all the data devspaces. An assignment of -tables to data devspaces is not possible nor is it necessary. - -When installing the database instance you can configure one or more data -devspaces and while the database is running you can also add new data -devspaces. The disk storage space defined by all the data devspaces is the -total size of the database. - -devspace - -This term denotes a physical disk or part of a physical disk. This can be a -raw device or a file. - -log devspace - -What is recorded in a log devspace is all the changes in the contents of the -database, to enable the contents to be recovered or restored after hardware -faults. The complete log can consist of a number of devspaces. You can -define the number of log devspaces required when installing the database -instance and can add new log devspaces even while the database is operating. -To ensure that the data on the database is kept safe, you have the option of -mirroring the log devspace(s) (set parameter LOG_MODE to DUAL). - -In log backups the contents of the log devspace(s) is copied to a file and -the space originally occupied by it is released for log data. The backup -files are numbered by the system in sequence. The selected size of the -archive log devspace should therefore be sufficient for all the changes -occurring between two backups to be recorded there. - -serverdb - -A Serverdb consists of the system devspace, one or more log devspaces, and -one or more data devspaces. - -For security and performance reasons, each devspace type should be kept on a -different disk. The log devspaces of a serverdb can also be mirrored to -obtain a higher degree of availability. The disks used should present -uniform performance data (especially access speeds) because this is the only -way that equal usage of the devspaces can be achieved. 
If necessary, a -database instance can be expanded by additional data devspaces while the -database is running. - -The devspace usage level of a database instance is therefore a critical -parameter of database operation and must be monitored. If the data devspaces -become full, database operation stops. Further data devspaces can be defined -in this state to allow database operation to continue. - -system devspace - -The restart information and the mapping of the logical page numbers to -physical page addresses are administered in the system devspace. The size of -the system devspace therefore depends directly on the database size and is -determined by the database kernel. -^^^^^^^^^^^^^^^^^ - -The concept of just flexibly assigning space to databases, -with only two types of space that should be kept on -different spindlesets, plus the ability to add space -*while running* is what justifies their claim to much -easier admin than Oracle. - -Many PostgreSQL sites run with far too few spindles anyway -and don't have DBAs with a clue what to do with tablespaces. -Now that SAP DB is also open source, making it easy for them -could be critically important. - -I'm not even subscribed to pgsql-hackers and don't understand -the internals enough to have any view on whether it's possible -or how. - -But if it is possible to present similar *concepts* to DBAs -from the "outside", with whatever actually goes on internally, -that would be really *great*. - -Once the internals are done, others could more easily add -admin tools and documentation comparable to SAP DB. Given -the overwhelming advantages of PostgreSQL from all other -points of view, this could be critically important. - -I was surprised to find no discussion of comparisons with -SAP DB and what could be learned from its source release -in a quick search of the web site and mailing lists. 
- -Seeya, Albert - - -From pgsql-general-owner+M14288@postgresql.org Mon Aug 27 10:31:19 2001 -Return-path: -Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f7REVIF27112 - for ; Mon, 27 Aug 2001 10:31:18 -0400 (EDT) -Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28]) - by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f7REVkq86991; - Mon, 27 Aug 2001 09:31:47 -0500 (CDT) - (envelope-from pgsql-general-owner+M14288@postgresql.org) -Received: from svana.org (svana.org [210.9.66.30]) - by postgresql.org (8.11.3/8.11.4) with ESMTP id f7RDcEf82291 - for ; Mon, 27 Aug 2001 09:38:15 -0400 (EDT) - (envelope-from kleptog@svana.org) -Received: from kleptog by svana.org with local (Exim 3.12 #1 (Debian)) - id 15bMal-0000Ac-00; Mon, 27 Aug 2001 23:38:15 +1000 -Date: Mon, 27 Aug 2001 23:38:15 +1000 -From: Martijn van Oosterhout -To: newsreader@mediaone.net -cc: Jeff Davis , pgsql-general@postgresql.org -Subject: Re: [GENERAL] raw partition -Message-ID: <20010827233815.B32309@svana.org> -Reply-To: Martijn van Oosterhout -References: <20010826125450.A11535@dragon.universe> <0GIP004VTV1MTO@mta7.pltn13.pbi.net> <20010827091141.A3208@dragon.universe> -MIME-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -Content-Disposition: inline -User-Agent: Mutt/1.2.5i -In-Reply-To: <20010827091141.A3208@dragon.universe>; from newsreader@mediaone.net on Mon, Aug 27, 2001 at 09:11:41AM -0400 -Precedence: bulk -Sender: pgsql-general-owner@postgresql.org -Status: OR - -On Mon, Aug 27, 2001 at 09:11:41AM -0400, newsreader@mediaone.net wrote: -> On Mon, Aug 27, 2001 at 12:46:16AM -0700, Jeff Davis wrote: -> > On Sunday 26 August 2001 09:54 am, you wrote: -> > -> > Obviously, if done properly, it couldn't hurt. However, is it really worth -> > the extra trouble to set it up, and more so, to debug an extra form that disk -> -> I think it's only a matter of getting rid -> of file system layer. 
- -But that won't work. Postgres currently stores each table in its own file. -Thus, to implement raw access postgres would have to implement its own -filesystem within the raw partition. - -By using the filesystems built into the OS, it can take advantage of -filesystem smarts already there. Not to mention people just being able to use -normal system commands to view what's there e.g. symlinks to relocate -tables. I believe that filesystem technology within the OS will advance much -faster than anything the postgres developers could come up with. - -For example, by running your database on an ext3 partition, all file -metadata is automatically journalled, with no additional effort from the -postgres developers. You could even choose to journal all database access -(though I have no idea how that interacts with WAL). - -> > marginal utility for integrated functionality? Consider this: should postgres -> > be its own OS; bootable and everything (get rid of all that OS overhead)? I -> -> file system overhead is all, I think. -> The only thing I am sure about is that -> whether pg (and developers) will have to be -> aware of the disk technology since it is -> evolving continuously. Or is there another -> layer provided by the OS: a layer -> between physical disk and the filesystem? -> That layer will have to understand UDMA technology, -> SCSI technology? I have no idea. - -Well, a raw partition provided by the OS would hide such details. However, -postgres would have to make assumptions about what kind of access patterns -are optimal. The kernel is in a much better position to make such decisions -about resource usage. Which is precisely why we have OS's in the first place. - -> > oracle allows this behaviour you speak of, but I have never used it. Does -> > someone have experience (or benchmarks or whatever) with oracle's -> > implementation? 
-> -> I have never used an oracle - -I beleive (someone correct me if I'm wrong) that even when used on a -filesystem, oracle still places all it's tables in a single file i.e. it has -a filesystem layer builtin. I think that's why it's a clear win for oracle -because you *are* actually removing a layer. - -IMHO it's something postgres should stay well away from. --- -Martijn van Oosterhout -http://svana.org/kleptog/ -> It would be nice if someone came up with a certification system that -> actually separated those who can barely regurgitate what they crammed over -> the last few weeks from those who command secret ninja networking powers. - ----------------------------(end of broadcast)--------------------------- -TIP 2: you can get off all lists at once with the unregister command - (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) - -From jim@buttafuoco.net Sun Mar 3 14:34:59 2002 -Return-path: -Received: from dual.buttafuoco.net (vsat-148-63-214-126.c004.g4.mrt.starband.net [148.63.214.126]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g23KYjM24547 - for ; Sun, 3 Mar 2002 15:34:52 -0500 (EST) -Received: from buttafuoco.net (dual [127.0.0.1]) - by dual.buttafuoco.net (8.11.2/8.11.2) with ESMTP id g23KYaF05729; - Sun, 3 Mar 2002 15:34:36 -0500 -From: "Jim Buttafuoco" -To: Bruce Momjian , jim@buttafuoco.net -cc: Vadim Mikheev , - pgsql-hackers -Reply-To: jim@buttafuoco.net -Subject: Re: [HACKERS] Status of index location patch -Date: Sun, 3 Mar 2002 15:34:36 -0500 -Message-ID: <20020303153436.M48726@buttafuoco.net> -In-Reply-To: <200202221805.g1MI5o429319@candle.pha.pa.us> -References: <200109151754.f8FHsdB08189@dual.buttafuoco.net> <200202221805.g1MI5o429319@candle.pha.pa.us> -X-Mailer: Open WebMail 1.62 20020220 -X-OriginatingIP: 192.1.3.22 (jim) -MIME-Version: 1.0 -Content-Type: text/plain; charset=iso-8859-1 -Status: ORr - -Bruce, - -I stopped all work on this since people seemed confused about the -tablespace/location words. 
I don't think enough of the "core" team likes -this idea. Am I wrong here? Did I explain the patch well enough? - -Please let me know, I still am planning on doing it for internal use. I -would prefer that it was a standard feature. If you think I should still -pursue this, let me know what I need to do to get it off the ground. - -Thanks for your help -Jim - - - -> Jim, do you have an updated patch that you would like applied for 7.3? -> -> --------------------------------------------------------------------------- -> -> Jim Buttafuoco wrote: -> > Vadim, -> > -> > I guess I am still confused... -> > -> > In dbcommands.c resolve_alt_dbpath() takes the db oid as an argument. -> > This number is used to "find" the directory where the data files live. -> > All the patch does is put the indexes into a "db oid"_index directory -> > instead of "db oid" -> > -> > -> > This is for tables snprintf(ret, len, "%s/base/%u", prefix, dboid); -> > This is for indexes snprintf(ret, len, "%s/base/%u_index", prefix, -> > dboid); -> > -> > And in catalog.c -> > tables: sprintf(path, "%s/base/%u/%u", DataDir, rnode.tblNode, -> > rnode.relNode); -> > indexes: sprintf(path, "%s/base/%u_index/%u", DataDir, -> > rnode.tblNode,rnode.relNode); -> > -> > Can you explain how I would get the tblNode for an existing database's -> > index files if it doesn't have the same OID as the database entry in -> > pg_database. -> > -> > Jim -> > -> > -> > > > Just wondering what is the status of this patch. It seems from -> > comments -> > > > that people like the idea. I have also looked in the archives for -> > other -> > > > people looking for this kind of feature and have found a lot of -> > interest. -> > > > -> > > > If you think it is a good idea for 7.2, let me know what needs to be -> > > > changed and I will work on it this weekend. -> > > -> > > Just change index' dir naming as was already discussed. 
-> > > -> > > Vadim -> > > -> > > -> > -> > -> > -> > ---------------------------(end of broadcast)--------------------------- -> > TIP 2: you can get off all lists at once with the unregister command -> > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) -> > -> -> -- -> Bruce Momjian | http://candle.pha.pa.us -> pgman@candle.pha.pa.us | (610) 853-3000 -> + If your life is a hard drive, | 830 Blythe Avenue -> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 - - - - -From lockhart@fourpalms.org Tue Mar 5 08:02:50 2002 -Return-path: -Received: from golem.fourpalms.org (www.fourpalms.org [64.3.68.148]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g25E2oY04958 - for ; Tue, 5 Mar 2002 09:02:50 -0500 (EST) -Received: from fourpalms.org (localhost.localdomain [127.0.0.1]) - by golem.fourpalms.org (Postfix) with ESMTP - id CACDD1BC83; Tue, 5 Mar 2002 06:02:47 -0800 (PST) -Sender: lockhart@fourpalms.org -Message-ID: <3C84D007.46FB30B6@fourpalms.org> -Date: Tue, 05 Mar 2002 06:02:47 -0800 -From: Thomas Lockhart -Reply-To: lockhart@fourpalms.org -Organization: Yes -X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.8-34.1mdksmp i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Tom Lane -cc: Bruce Momjian , jim@buttafuoco.net, - pgsql-hackers -Subject: Re: Storage Location Patch Proposal for V7.3 -References: <200203050631.g256Vh924330@candle.pha.pa.us> <13961.1015313407@sss.pgh.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: OR - -... -> Forward compatibility to a future tablespace implementation. -> If we do this, we'll be stuck with supporting this feature set, -> not to mention this syntax; neither of which have garnered any -> support from the assembled hackers. - -The feature set (in some incarnation) is exactly something we should -have. "Tablespace" could mean almost anything, since (I recall that) we -are not slavishly copying the Oracle features having a similar name. 
The -syntax (or something similar) seems acceptable to me. I haven't looked -at the implementation itself. - -So, I'll guess that the particular objection to this implementation is -along the lines of wanting to be able to manage tablespaces/locations as -a single entity? So that one could issue commands like (forgive the -syntax) "move tablespace xxx to yyy;" and be able to yank the entire -contents from one place to another in a single line? - -Jim's patches don't explicitly tie the pieces residing in a single -location together. Is that the objection? In all other respects (and -perhaps in all respects period) it seems to be a good starting point at -least. - -I know that you have said that you want to look at "tablespaces" for -7.3. If we get there with a feature set we all find acceptable, then -great. If we don't, then Jim's subset of features would be great to -have. - -Comments? - - - Thomas - -From pgsql-hackers-owner+M19763@postgresql.org Wed Mar 6 19:50:47 2002 -Return-path: -Received: from postgresql.org (postgresql.org [64.49.215.8]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g271okY15943 - for ; Wed, 6 Mar 2002 20:50:46 -0500 (EST) -Received: from postgresql.org (postgresql.org [64.49.215.8]) - by postgresql.org (Postfix) with SMTP - id 220A3475B48; Wed, 6 Mar 2002 20:49:59 -0500 (EST) -Received: from dual.buttafuoco.net (vsat-148-63-214-126.c004.g4.mrt.starband.net [148.63.214.126]) - by postgresql.org (Postfix) with ESMTP id 4D925475881 - for ; Wed, 6 Mar 2002 20:44:51 -0500 (EST) -Received: from buttafuoco.net (dual [127.0.0.1]) - by dual.buttafuoco.net (8.11.2/8.11.2) with ESMTP id g271ihm25853 - for ; Wed, 6 Mar 2002 20:44:43 -0500 -From: "Jim Buttafuoco" -To: "pgsql-hackers" -Reply-To: jim@buttafuoco.net -Subject: [HACKERS] Storage Location / Tablespaces (try 3) -Date: Wed, 6 Mar 2002 20:44:43 -0500 -Message-ID: <20020306204443.M82891@buttafuoco.net> -X-Mailer: Open WebMail 1.62 20020220 -X-OriginatingIP: 192.1.3.22 (jim) -MIME-Version: 
1.0 -Content-Type: text/plain; charset=iso-8859-1 -Precedence: bulk -Sender: pgsql-hackers-owner@postgresql.org -Status: OR - -Me again, I have some more details on my storage location patch - - - -This patch would allow the system admin (DBA) to specify the location of -databases, tables/indexes and temporary objects (temp tables and temp sort -space) independent of the database/system default location. This patch would -replace the current "LOCATION" code. - -Please let me know if you have any questions/comments. I would like to see -this feature make 7.3. I believe it will take about 1 month of coding and -testing after I get started. - -Thanks -Jim - -============================================================================== -Storage Location Patch (Try 3) - - -(If people like TABLESPACE instead of LOCATION then s/LOCATION/TABLESPACE/g -below) - - -This patch would add the following NEW commands ----------------------------------------------------- - CREATE LOCATION name PATH 'dbpath'; - DROP LOCATION name; - -where dbpath is any directory that the postgresql backend can write to. -(I know this is how Oracle works, don't know about the other major db systems) - -The following NEW GLOBAL system table would be added. ------------------------------------------------------ -PG_LOCATION -( - LOC_NAME name, - LOC_PATH text -- This should be able to take any path name. -); -(initdb would add (PGDATA,'/usr/local/pgsql/data') - -The following system tables would need to be modified ------------------------------------------------------ -PG_DATABASE drop datpath - add DATA_LOC_NAME name or DATA_LOC_OID OID - add INDEX_LOC_NAME name or INDEX_LOC_OID OID - add TEMP_LOC_NAME name or TEMP_LOC_OID OID -PG_CLASS to add LOC_NAME name or LOC_OID OID - -DATA_LOC_* and INDEX_LOC_* would default to PGDATA if not specified. 
- -(I like *LOC_NAME better but I believe the rest of the systems tables use OID) - - -The following command syntax would be modified ------------------------------------------------------- -CREATE DATABASE WITH DATA_LOCATION=XXX INDEX_LOCATION=YYY TEMP_LOCATION=ZZZ -CREATE TABLE aaa (...) WITH LOCATION=XXX; -CREATE TABLE bbb (c1 text primary key location CCC) WITH LOCATION=XXX; -CREATE TABLE ccc (c2 text unique location CCC) WITH LOCATION=XXX; -CREATE INDEX XXX on SAMPLE (C2) WITH LOCATION BBB; - - - -Now for an example ------------------------------------------------------- -First: - postgresql is installed at /usr/local/pgsql - userid postgres - the postgres user also is the owner of /pg01 /pg02 /pg03 - -the dba executes the following script -CREATE LOCATION pg01 PATH '/pg01'; -CREATE LOCATION pg02 PATH '/pg02'; -CREATE LOCATION pg03 PATH '/pg03'; -CREATE LOCATION bigdata PATH '/bigdata'; -CREATE LOCATION bigidx PATH '/bigidx'; -\q - -PG_LOCATION now has -pg01 | /pg01 -pg02 | /pg02 -pg03 | /pg03 -bigdata | /bigdata -bigidx | /bigidx - -Now the following command is run -CREATE DATABASE jim1 WITH DATA_LOCATION='pg01' INDEX_LOCATION='pg02' -TEMP_LOCATION='pg03' --- OID of 'jim1' tuple is 1786146 - -on disk the directories look like this -/pg01/1786146 <<-- Default DATA Location -/pg02/1786146 <<-- Default INDEX Location -/pg03/1786146 <<-- Default Temp Location - -All files from the above directories will have symbolic links to -/usr/local/pgsql/data/base/1786146/ - - - -Now the system will have 1 BIG table that will get its own disk for data and -its own disk for index -create table big (a text,b text ..., primary key (a,b) location 'bigidx'); - -oid of big table is 1786150 -oid of big table primary key index is 1786151 - -on disk directories look like this -/bigdata/1786146/1786150 -/bigidx/1786146/1786151 -/usr/local/pgsql/data/base/1786146/1786150 symbolic link to -/bigdata/1786146/1786150 -/usr/local/pgsql/data/base/1786146/1786151 symbolic link to 
-/bigidx/1786146/1786151 - - - -The symbolic links will enable the rest of the software to be location -independent. - - - ----------------------------(end of broadcast)--------------------------- -TIP 2: you can get off all lists at once with the unregister command - (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) - -From pgsql-hackers-owner+M19814@postgresql.org Thu Mar 7 17:25:06 2002 -Return-path: -Received: from postgresql.org (postgresql.org [64.49.215.8]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g27NP4Q10967 - for ; Thu, 7 Mar 2002 18:25:05 -0500 (EST) -Received: from postgresql.org (postgresql.org [64.49.215.8]) - by postgresql.org (Postfix) with SMTP - id 74CC94761DE; Thu, 7 Mar 2002 17:50:44 -0500 (EST) -Received: from sss.pgh.pa.us (unknown [192.204.191.242]) - by postgresql.org (Postfix) with ESMTP id 712F0476101 - for ; Thu, 7 Mar 2002 17:47:04 -0500 (EST) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g27MkaS15710; - Thu, 7 Mar 2002 17:46:41 -0500 (EST) -To: jim@buttafuoco.net -cc: "Zeugswetter Andreas SB SD" , - "pgsql-hackers" -Subject: Re: [HACKERS] Storage Location / Tablespaces (try 3) -In-Reply-To: <20020307160519.M90856@buttafuoco.net> -References: <46C15C39FEB2C44BA555E356FBCD6FA4961D67@m0114.s-mxs.net> <20020307160519.M90856@buttafuoco.net> -Comments: In-reply-to "Jim Buttafuoco" - message dated "Thu, 07 Mar 2002 16:05:19 -0500" -Date: Thu, 07 Mar 2002 17:46:36 -0500 -Message-ID: <15707.1015541196@sss.pgh.pa.us> -From: Tom Lane -Precedence: bulk -Sender: pgsql-hackers-owner@postgresql.org -Status: OR - -"Jim Buttafuoco" writes: -> My first try passed the tablespace OID around but someone pointed out that the -> WAL code doesn't know what the tablespace OID is or what its location is. 
- -The low-level file access code (including WAL references) names tables -by two OIDs, which currently are database OID and relfilenode (the -latter is NOT to be considered equivalent to table OID, even though it -presently always is equal). - -I believe that the correct implementation approach is to revise things -so that the low-level name of a table is tablespace OID + relfilenode; -this physical table name would in concept be completely distinct from -the logical table identification (database OID + table OID). The file -reference path would become something like -"$PGDATA/base/tablespaceoid/relfilenode", where tablespaceoid might -reference a symlink to a directory instead of a plain directory. -Tablespace management then consists of setting up those symlinks -correctly, and there is essentially zero impact on the low-level access -code. - -The hard part of this is that we are probably being sloppy in some -places about the difference between physical and logical table -identifications. Those places will need to be found and fixed. -This needs to happen anyway, of course, since the point of introducing -relfilenode was to allow table versioning, which we still want. - -Vadim suggested long ago that bufmgr, smgr, and below should have -nothing to do with referencing files by relcache entries; they should -only deal in physical file identifiers. That requires some tedious but -(in principle) straightforward API changes. - -BTW, if tablespaces can be shared by databases then DROP DATABASE -becomes rather tricky: how do you zap the correct files out of a shared -tablespace, keeping in mind that you are not logged into the doomed -database and can't look at its catalogs? The best idea I've seen for -this so far is: - -1. Access path for tables is really - $PGDATA/base/databaseoid/tablespaceoid/relfilenode. -(BTW, we could save some work if we chdir'd into -$PGDATA/base/databaseoid at backend start and then used only relative -tablespaceoid/relfilenode paths. 
Right now we tend to use absolute -paths because the bootstrap code doesn't do that chdir; which seems -like a stupid solution...) - -2. A shared tablespace directory contains a subdirectory for each database -that has files in the tablespace. Thus, the actual filesystem location -of a table is something like - /databaseoid/relfilenode -The symlink from a database's $PGDATA/base/databaseoid/ directory to -the tablespace points at /databaseoid. The first attempt to -create a table in a tablespace from a particular database will create -the hard subdirectory and set up the symlink; or perhaps that should be -done by an explicit tablespace management operation to "connect" the -database to the tablespace. - -3. To drop a database, we examine the symlinks in its -$PGDATA/base/databaseoid/ and rm -rf each referenced tablespace -subdirectory before rm -rf'ing $PGDATA/base/databaseoid. - - regards, tom lane - ----------------------------(end of broadcast)--------------------------- -TIP 2: you can get off all lists at once with the unregister command - (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) - -From pgsql-general-owner+M22554=candle.pha.pa.us=pgman@postgresql.org Mon Mar 25 01:56:17 2002 -Return-path: -Received: from postgresql.org (postgresql.org [64.49.215.8]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g2P7uGa20556 - for ; Mon, 25 Mar 2002 02:56:16 -0500 (EST) -Received: from postgresql.org (postgresql.org [64.49.215.8]) - by postgresql.org (Postfix) with SMTP id D28B2475B61 - for ; Mon, 25 Mar 2002 02:56:17 -0500 (EST) -Received: from sss.pgh.pa.us (unknown [192.204.191.242]) - by postgresql.org (Postfix) with ESMTP id EB3244758E9 - for ; Mon, 25 Mar 2002 02:55:54 -0500 (EST) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g2P7toS17527; - Mon, 25 Mar 2002 02:55:50 -0500 (EST) -To: Bruce Momjian -cc: Richard Emberson , pgsql-general@postgresql.org -Subject: Re: 
[GENERAL] Large Object Location in 7.3 -In-Reply-To: <200203241932.g2OJWGV00796@candle.pha.pa.us> -References: <200203241932.g2OJWGV00796@candle.pha.pa.us> -Comments: In-reply-to Bruce Momjian - message dated "Sun, 24 Mar 2002 14:32:16 -0500" -Date: Mon, 25 Mar 2002 02:55:50 -0500 -Message-ID: <17524.1017042950@sss.pgh.pa.us> -From: Tom Lane -Precedence: bulk -Sender: pgsql-general-owner@postgresql.org -Status: OR - -Bruce Momjian writes: -> Richard Emberson wrote: ->> I expect (actually hope) to have thousands and thousands of blob/clobs ->> in the db I am designing. ->> I would like such largeobjects to be stored in their own file system. - -> Sure, find the oid of pg_largeobject and symlink that to another file -> system. You need to do that toast table and any indexes for the table -> too. - -If Richard's envisioning more than 1GB of large objects, I don't think -he's going to be very satisfied with manual symlinking. - -This does bring up an interesting point: the tablespace schemes we've -discussed so far don't allow system catalogs to be moved out of the -default tablespace for a database. That doesn't bother me for most -of the system catalogs ... but pg_largeobject seems like it might be -an exception. 
- - regards, tom lane - ----------------------------(end of broadcast)--------------------------- -TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org - -From pgsql-hackers-owner+M38980=pgman=candle.pha.pa.us@postgresql.org Mon May 19 19:04:15 2003 -Return-path: -Received: from relay2.pgsql.com (host-64-117-225-159.altec1.com [64.117.225.159] (may be forged)) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h4JN4Dv08477 - for ; Mon, 19 May 2003 19:04:14 -0400 (EDT) -Received: from postgresql.org (unknown [64.117.224.193]) - by relay2.pgsql.com (Postfix) with ESMTP id 3EC59FA439 - for ; Mon, 19 May 2003 18:39:59 -0400 (EDT) -X-Original-To: pgsql-hackers@postgresql.org -Received: from localhost (unknown [64.117.224.193]) - by developer.postgresql.org (Postfix) with ESMTP id 7BFFA92617A - for ; Mon, 19 May 2003 18:39:07 -0400 (EDT) -Received: from developer.postgresql.org ([64.117.224.193]) - by localhost (developer.postgresql.org [64.117.224.193:10024]) (amavisd-new) - with ESMTP id 47742-05 for ; - Mon, 19 May 2003 18:39:01 -0400 (EDT) -Received: from smxsat1.smxs.net (smxsat1.smxs.net [213.150.10.1]) - by developer.postgresql.org (Postfix) with ESMTP id F30679255EE - for ; Mon, 19 May 2003 04:46:14 -0400 (EDT) -Received: from m01x1.s-mxs.net [10.3.55.201] - by smxsat1.smxs.net - over TLS secured channel - with XWall v3.26e ; - Mon, 19 May 2003 10:46:11 +0200 -Received: from m0102.s-mxs.net [10.3.55.2] - by m01x1.s-mxs.net - with XWall v3.26 ; - Mon, 19 May 2003 10:46:10 +0200 -Received: from m0114.s-mxs.net ([10.3.55.14]) by m0102.s-mxs.net with Microsoft SMTPSVC(5.0.2195.5329); - Mon, 19 May 2003 10:46:04 +0200 -content-class: urn:content-classes:message -MIME-Version: 1.0 -Content-Type: text/plain; - charset="us-ascii" -Subject: Re: [HACKERS] Feature suggestions (long) -X-MimeOLE: Produced By Microsoft Exchange V6.0.6434.0 -Date: Mon, 19 May 2003 10:46:04 +0200 -Message-ID: <46C15C39FEB2C44BA555E356FBCD6FA4961FB2@m0114.s-mxs.net> 
-Thread-Topic: [HACKERS] Feature suggestions (long) -Thread-Index: AcMciTRmS8S5HY34Q62Cd5TpVM44pwBUcmeQ -From: "Zeugswetter Andreas SB SD" -To: "Martijn van Oosterhout" , - -X-OriginalArrivalTime: 19 May 2003 08:46:04.0672 (UTC) FILETIME=[152F7800:01C31DE3] -X-Virus-Scanned: by amavisd-new -Precedence: bulk -Sender: pgsql-hackers-owner@postgresql.org -Content-Transfer-Encoding: 8bit -X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id h4JN4Dv08477 -Status: OR - -> Partitions -> ========== - -> Next stage would be teaching the planner. The conditions would be -> pseudo-constraints on the partitions. Hence if the conditions and the -> constraints form a non-intersecting set, you can skip that partition -> altogether. - -Make that "normal check constraints", and make the planner consider -constraints, -and I think that by itself combined with the current featureset will -be much more powerful than any of the "partitioning" features out there. -(This is mainly needed to optimize selects on the big union all view) - -Imho if a dba starts to partition, he usually needs to be more involved -than the average user, so I think he should be able cope with compexity. -What imho would help, is a tool that generates a suggested rule set, -indexes -and actions, which the dba can review and apply. I do not think new SQL -syntax -would really help, since that would somehow hide the great existing -power of -the rule system. A tool would teach the dba, and empower him to use it. - -And yes, creating several smaller tables and adding the appropriate -rules -usually makes the VLDB life much easier compared to growing single -tables into -the TB range. - -Andreas - ----------------------------(end of broadcast)--------------------------- -TIP 5: Have you checked our extensive FAQ? 
- -http://www.postgresql.org/docs/faqs/FAQ.html - -From pgsql-hackers-owner+M39002=pgman=candle.pha.pa.us@postgresql.org Tue May 20 11:08:18 2003 -Return-path: -Received: from relay3.pgsql.com (relay3.pgsql.com [64.117.224.149]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h4KF8Br13455 - for ; Tue, 20 May 2003 11:08:16 -0400 (EDT) -Received: from postgresql.org (unknown [64.117.224.193]) - by relay3.pgsql.com (Postfix) with ESMTP id 6754111262F2 - for ; Tue, 20 May 2003 15:08:07 +0000 (GMT) -X-Original-To: pgsql-hackers@postgresql.org -Received: from localhost (unknown [64.117.224.193]) - by developer.postgresql.org (Postfix) with ESMTP id 5119A924FA2 - for ; Tue, 20 May 2003 11:02:38 -0400 (EDT) -Received: from developer.postgresql.org ([64.117.224.193]) - by localhost (developer.postgresql.org [64.117.224.193:10024]) (amavisd-new) - with ESMTP id 79611-01 for ; - Tue, 20 May 2003 11:02:34 -0400 (EDT) -Received: from flake.decibel.org (flake.decibel.org [66.143.173.58]) - by developer.postgresql.org (Postfix) with SMTP id C9F22924FA0 - for ; Tue, 20 May 2003 11:02:29 -0400 (EDT) -Received: (qmail 20461 invoked by uid 1001); 20 May 2003 15:02:24 -0000 -Date: Tue, 20 May 2003 10:02:24 -0500 -From: "Jim C. Nasby" -To: Martijn van Oosterhout -cc: Zeugswetter Andreas SB SD , - pgsql-hackers@postgresql.org -Subject: Re: [HACKERS] Feature suggestions (long) -Message-ID: <20030520150224.GL40542@flake.decibel.org> -Reply-To: jim@nasby.net -References: <46C15C39FEB2C44BA555E356FBCD6FA4961FB4@m0114.s-mxs.net> <20030519144000.GG24653@svana.org> -MIME-Version: 1.0 -Content-Type: text/plain; charset=us-ascii -Content-Disposition: inline -In-Reply-To: <20030519144000.GG24653@svana.org> -User-Agent: Mutt/1.4.1i -X-Operating-System: FreeBSD 4.8-RELEASE i386 -X-Distributed: Join the Effort! 
http://www.distributed.net -X-Virus-Scanned: by amavisd-new -Precedence: bulk -Sender: pgsql-hackers-owner@postgresql.org -Status: OR - -On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote: -> Anyway, the general trend seems to be against the idea so I may as well go -> think of something else :) - -I'm disappointed to hear that. Having no way to effectively partition -data is a real pain in pgsql, and your proposal would adress that. Yes, -you can build it yourself by creating the view and all the rules by -hand, but that has a lot of drawbacks: - -It's completely PGSQL specific -It leaves no possibility for performance improvements down the road -It's a lot of code to write -You have to manually maintain it all every time you need to add a new -partition (in your example, at the start of every year). - -I don't know what the policies for patches are, but I'd hope that the -core team would consider adding this functionality, especially since a -first-round implimentation can be done entirely with rules (or so it -seems). - -I certainly understand that development time is a very limited resource, -and I'm willing to work on this (though I'm not a C coder). Even if no -one can commit to this right now, can't it be added to the todo list? --- -Jim C. Nasby (aka Decibel!) jim@nasby.net -Member: Triangle Fraternity, Sports Car Club of America -Give your computer some brain candy! www.distributed.net Team #1828 - -Windows: "Where do you want to go today?" -Linux: "Where do you want to go tomorrow?" -FreeBSD: "Are you guys coming, or what?" 
- ----------------------------(end of broadcast)--------------------------- -TIP 2: you can get off all lists at once with the unregister command - (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) - -From pgsql-hackers-owner+M39006=pgman=candle.pha.pa.us@postgresql.org Tue May 20 11:30:11 2003 -Return-path: -Received: from relay3.pgsql.com (relay3.pgsql.com [64.117.224.149]) - by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h4KFU5r15234 - for ; Tue, 20 May 2003 11:30:06 -0400 (EDT) -Received: from postgresql.org (unknown [64.117.224.193]) - by relay3.pgsql.com (Postfix) with ESMTP id 1A18211260D9 - for ; Tue, 20 May 2003 15:30:02 +0000 (GMT) -X-Original-To: pgsql-hackers@postgresql.org -Received: from localhost (unknown [64.117.224.193]) - by developer.postgresql.org (Postfix) with ESMTP id 80D32925003 - for ; Tue, 20 May 2003 11:24:16 -0400 (EDT) -Received: from developer.postgresql.org ([64.117.224.193]) - by localhost (developer.postgresql.org [64.117.224.193:10024]) (amavisd-new) - with ESMTP id 85756-06 for ; - Tue, 20 May 2003 11:24:13 -0400 (EDT) -Received: from svana.org (svana.org [203.20.62.76]) - by developer.postgresql.org (Postfix) with ESMTP id 532D6923324 - for ; Tue, 20 May 2003 11:24:11 -0400 (EDT) -Received: from kleptog by svana.org with local (Exim 3.35 #1 (Debian)) - id 19I8xs-0001rc-00; Wed, 21 May 2003 01:23:44 +1000 -Date: Wed, 21 May 2003 01:23:44 +1000 -From: Martijn van Oosterhout -To: "Jim C. 
Nasby" -cc: Zeugswetter Andreas SB SD , - pgsql-hackers@postgresql.org -Subject: Re: [HACKERS] Feature suggestions (long) -Message-ID: <20030520152344.GH4069@svana.org> -Reply-To: Martijn van Oosterhout -References: <46C15C39FEB2C44BA555E356FBCD6FA4961FB4@m0114.s-mxs.net> <20030519144000.GG24653@svana.org> <20030520150224.GL40542@flake.decibel.org> -MIME-Version: 1.0 -Content-Type: multipart/signed; micalg=pgp-sha1; - protocol="application/pgp-signature"; boundary="ZYOWEO2dMm2Af3e3" -Content-Disposition: inline -In-Reply-To: <20030520150224.GL40542@flake.decibel.org> -User-Agent: Mutt/1.3.28i -X-PGP-Key-ID: Length=1024; ID=0x0DC67BE6 -X-PGP-Key-Fingerprint: 295F A899 A81A 156D B522 48A7 6394 F08A 0DC6 7BE6 -X-PGP-Key-URL: -X-Virus-Scanned: by amavisd-new -Precedence: bulk -Sender: pgsql-hackers-owner@postgresql.org -Status: OR - ---ZYOWEO2dMm2Af3e3 -Content-Type: text/plain; charset=us-ascii -Content-Disposition: inline -Content-Transfer-Encoding: quoted-printable - -On Tue, May 20, 2003 at 10:02:24AM -0500, Jim C. Nasby wrote: -> On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote: -> > Anyway, the general trend seems to be against the idea so I may as well= - go -> > think of something else :) ->=20 -> I'm disappointed to hear that. Having no way to effectively partition -> data is a real pain in pgsql, and your proposal would adress that. Yes, -> you can build it yourself by creating the view and all the rules by -> hand, but that has a lot of drawbacks: - -I agree, there is a lot of potential here. And I don't beleive it would be -too much work as most of the infrastructure is already there. At this stage -I'm just wondering if it will go on the TODO list. I propose that the -following items be added: - - * Improve the planner to take CHECK constraints into account to prune th= -e plan. 
- * Allow a single index to index multiple tables (also for inherited PRIM= -ARY KEYS) - * Allow partitioning of table into multiple subtables - -The first two items would be useful in their own right. With them the final -one would be straight forward. I'd be prepared to put some effort into this -if there is some indication it would be accepted. - -> I don't know what the policies for patches are, but I'd hope that the -> core team would consider adding this functionality, especially since a -> first-round implimentation can be done entirely with rules (or so it -> seems). - -Well, I think the policy is 'if you write the code you have a better chance -to have it accepted' :) So, if it's likely to be accepted then we only need -to find someone to code it. Given the other priorities currently I think -waiting for the core team to write it would be futile (unless you can -convince someone like IBM to give the core team money to write it). - -Right now I'd be happy if the anonymous CVS server would talk to me :) - -By the way, has anyone given thought to user-defined storage managers? Apart -from allowing backward compatable table access, you could implement a simple -version of partitioning that doesn't take advantage of planner tricks. - -Have a nice day, ---=20 -Martijn van Oosterhout http://svana.org/kleptog/ -> "the West won the world not by the superiority of its ideas or values or -> religion but rather by its superiority in applying organized violence. -> Westerners often forget this fact, non-Westerners never do." -> - Samuel P. 
Huntington - ---ZYOWEO2dMm2Af3e3 -Content-Type: application/pgp-signature -Content-Disposition: inline - ------BEGIN PGP SIGNATURE----- -Version: GnuPG v1.0.6 (GNU/Linux) -Comment: For info see http://www.gnupg.org - -iD8DBQE+ykh/Y5Twig3Ge+YRAkC3AKCCHBjQKOnQEvMSHP5fvqKs3aDmSwCglzl+ -AcdlBtS/wZjauiKtyITTbZA= -=2dU9 ------END PGP SIGNATURE----- - ---ZYOWEO2dMm2Af3e3-- - -From pgsql-hackers-owner+M40393=pgman=candle.pha.pa.us@postgresql.org Thu Jun 26 12:52:55 2003 -Return-path: -Received: from www.postgresql.com (www.postgresql.com [64.117.225.209]) - by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id h5QGqqd00584 - for ; Thu, 26 Jun 2003 12:52:54 -0400 (EDT) -Received: from postgresql.org (developer.postgresql.org [64.117.224.193]) - by www.postgresql.com (Postfix) with ESMTP id 926C8CF76B8 - for ; Thu, 26 Jun 2003 13:52:47 -0300 (ADT) -X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org -Received: from localhost (unknown [64.117.224.193]) - by svr1.postgresql.org (Postfix) with ESMTP id E3C3C30FFC3 - for ; Thu, 26 Jun 2003 16:52:10 +0000 (GMT) -Received: from svr1.postgresql.org ([64.117.224.193]) - by localhost (svr1.postgresql.org [64.117.224.193]) (amavisd-new, port 10024) - with ESMTP id 87866-05 - for ; - Thu, 26 Jun 2003 13:52:00 -0300 (ADT) -Received: from sss.pgh.pa.us (unknown [192.204.191.242]) - by svr1.postgresql.org (Postfix) with ESMTP id C933230FFAF - for ; Thu, 26 Jun 2003 13:51:59 -0300 (ADT) -Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) - by sss.pgh.pa.us (8.12.9/8.12.9) with ESMTP id h5QGobQQ026342; - Thu, 26 Jun 2003 12:50:37 -0400 (EDT) -To: nolan@celery.tssi.com -cc: shridhar_daithankar@persistent.co.in (Shridhar Daithankar), - pgsql-hackers@postgresql.org -Subject: Re: [HACKERS] [GENERAL] Physical Database Configuration -In-Reply-To: <20030626162650.2579.qmail@celery.tssi.com> -References: <20030626162650.2579.qmail@celery.tssi.com> -Comments: In-reply-to nolan@celery.tssi.com - message dated "Thu, 26 Jun 2003 
11:26:50 -0500" -Date: Thu, 26 Jun 2003 12:50:36 -0400 -Message-ID: <26341.1056646236@sss.pgh.pa.us> -From: Tom Lane -Precedence: bulk -Sender: pgsql-hackers-owner@postgresql.org -Status: OR - -nolan@celery.tssi.com writes: -> Being able to zap a database with one or more 'rm -rf' commands assumes -> that there will be files from just ONE database permitted in any given -> tablespace, and ONLY files from that database. - -I said no such thing. Look at the structure again: - -$PGDATA/base/dboid/...stuff... - -sometablespace/dboid/...stuff... - -othertablespace/dboid/...stuff... - -DROPDB needs to nuke /dboid/ for each tablespace's associated -. The other design simplifies DROPDB at the cost of increased -complexity for every other tablespace management operation, since you'd -need to cope with a symlink in each database for each tablespace. - -Also, this scheme is at least theoretically amenable to a symlink-free -implementation, though I personally don't give a darn whether -tablespaces are supported on Windows and thus wouldn't expend the extra -effort needed to keep track of full paths. I'd want -$PGDATA/tablespaces/tboid to be a symlink to the root of the tablespace -with a given OID, and then the actual pathname used to access a table in -tablespace tboid, database dboid, table filenode rfoid would look like - $PGDATA/tablespaces/tboid/dboid/rfoid -But a Windoze version could in theory keep track of tablespace locations -directly, and replace the first part of this path with the actual -tablespace location. If we put tablespaces under directories then the -facility has zero functionality without symlinks, because you couldn't -actually do anything to segregate stuff within a database across -different devices. - -BTW, we'd probably remove $PGDATA/base in favor of $PGDATA/tablespaces/N -for some fixed-in-advance N that is the system tablespace, and we'd -require all system catalogs to live in this tablespace --- certainly at -least pg_class and its indexes. 
Otherwise you have circularity problems -in finding the catalogs ... - - regards, tom lane - ----------------------------(end of broadcast)--------------------------- -TIP 2: you can get off all lists at once with the unregister command - (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) -
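The path scheme in Tom Lane's last message above ($PGDATA/tablespaces/tboid/dboid/rfoid, with DROPDB removing the dboid subdirectory under each tablespace) can be sketched roughly as follows. This is only an illustration of the proposal as described in the mail, not PostgreSQL code: the function names, the /pgdata root, and the OID values are all made up for the example, and the sketch only builds path strings without touching the filesystem or resolving symlinks.

```python
import posixpath


def relation_path(pgdata, tboid, dboid, rfoid):
    """Build the proposed path for one relation file.

    Hypothetical helper: tboid is the tablespace OID, dboid the database
    OID, rfoid the relfilenode, following the names used in the mail.
    $PGDATA/tablespaces/tboid is assumed to be a symlink (or directory)
    pointing at the tablespace root.
    """
    return posixpath.join(pgdata, "tablespaces", str(tboid), str(dboid), str(rfoid))


def dropdb_targets(pgdata, tboids, dboid):
    """List the per-tablespace directories DROPDB would have to remove.

    Hypothetical helper: one <tablespace>/dboid directory per tablespace,
    so the doomed database's catalogs never need to be consulted.
    """
    return [posixpath.join(pgdata, "tablespaces", str(t), str(dboid)) for t in tboids]


if __name__ == "__main__":
    # Made-up OIDs, for illustration only.
    print(relation_path("/pgdata", 1, 16384, 16385))
    for path in dropdb_targets("/pgdata", [1, 2], 16384):
        print(path)
```

Because the database OID sits directly beneath each tablespace root in this layout, the list of directories to drop depends only on the set of tablespace OIDs and the database OID, which is the property the mail relies on for DROPDB.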