#!/usr/bin/perl # # My2Pg: MySQL to PostgreSQL dump conversion utility # # (c) 2000,2001 Maxim Rudensky # (c) 2000 Valentine Danilchuk # All right reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. All advertising materials mentioning features or use of this software # must display the following acknowledgement: # This product includes software developed by the Max Rudensky # and its contributors. # 4. Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # $My2pg: my2pg.pl,v 1.28 2001/12/06 19:32:20 fonin Exp $ # $Id: my2pg.pl,v 1.13 2004/04/19 23:18:12 momjian Exp $ # Custom patch # Revision 1.9 2002/08/22 00:01:39 tgl # Add a bunch of pseudo-types to replace the behavior formerly associated # with OPAQUE, as per recent pghackers discussion. I still want to do some # more work on the 'cstring' pseudo-type, but I'm going to commit the bulk # of the changes now before the tree starts shifting under me ... # # $Log: my2pg.pl,v $ # Revision 1.13 2004/04/19 23:18:12 momjian # Update to my2pg version 1.28, add docs, update URL for newest version. # # Create diff of custom changes Tom made to the utility for CREATE # FUNCTION. # # This will make moving this utility out of CVS easier. # # Revision 1.12 2004/04/19 23:11:49 momjian # Update to my2pg 1.28, from: # # http://www.omnistarinc.com/~fonin/downloads.php#my2pg # # Revision 1.28 2002/11/30 12:03:48 fonin # PostgreSQL does not support indexes on the partial length of column, # e.g. # CREATE INDEX i_index ON table (column(16)); # will not work. Fixed. # # Added command-line option -s that prevents my2pg from attempting convert # the data (currently only timestamps). # # Better timestamps conversion. # # Revision 1.27 2002/07/16 14:54:07 fonin # Bugfix - didn't quote the fields inside PRIMARY KEY with -d option. # Fix by Milan P. Stanic . # # Revision 1.26 2002/07/14 10:30:27 fonin # Bugfix - MySQL keywords inside data (INSERT INTO sentence) were replaced # with Postgres keywords and therefore messed up the data. # # Revision 1.25 2002/07/05 09:20:25 fonin # - fixed data that contains two consecutive timestamps - thanks to # Ben Darnell # - word 'default' was converted to upper case inside the data - fixed. # Thanks to Madsen Wikholm # # Revision 1.24 2002/04/20 14:15:43 fonin # Patch by Felipe Nievinski . # A table I was re-creating had a composite primary key, and I was using # the -d switch to maintain the table and column names # adding double quotes around them. # # The SQL code generated was something like this: # # CREATE TABLE "rinav" ( # "UnidadeAtendimento" INT8 DEFAULT '0' NOT NULL, # "NumeroRinav" INT8 DEFAULT '0' NOT NULL, # -- ... # PRIMARY KEY ("UnidadeAtendimento"," NumeroRinav") # ); # # Please note the space inside the second column name string in the PK # definition. Because of this PostgreSQL was not able to create the table. # # FIXED. # # Revision 1.23 2002/02/07 22:13:52 fonin # Bugfix by Hans-Juergen Schoenig : additional space after # FLOAT8 is required. # # Revision 1.22 2001/12/06 19:32:20 fonin # Patch: On line 594 where you check for UNIQUE, I believe the regex should try # and match 'UNIQUE KEY'. Otherwise it outputs no unique indexes for the # postgres dump. # Thanks to Brad Hilton # # Revision 1.21 2001/08/25 18:55:28 fonin # Incorporated changes from Yunliang Yu : # - By default table & column names are not quoted; use the new # "-d" option if you want to, # - Use conditional substitutions to speed up and preserve # the data integrity. # Fixes by Max: # - timestamps conversion fix. Shouldn't break now matching binary data and # strings. # # Revision 1.21 2001/07/23 03:04:39 yu # Updates & fixes by Yunliang Yu # . By default table & column names are not quoted; use the new # "-d" option if you want to, # . Use conditional substitutions to speed up and preserve # the data integrity. # # Revision 1.20 2001/07/05 12:45:05 fonin # Timestamp conversion enhancement from Joakim Lemström # # Revision 1.19 2001/05/07 19:36:38 fonin # Fixed a bug in quoting PRIMARY KEYs, KEYs and UNIQUE indexes with more than 2 columns. Thanks to Jeff Waugh . # # Revision 1.18 2001/03/06 22:25:40 fonin # Documentation up2dating. # # Revision 1.17 2001/03/04 13:01:50 fonin # Fixes to make work it right with MySQL 3.23 dumps. Tested on mysqldump 8.11. # Also, AUTO_INCREMENT->SERIAL fields no more have DEFAULT and NOT NULL # definitions. # # Revision 1.16 2001/02/02 08:15:34 fonin # Sequences should be created BEFORE creating any objects \nthat depends on it. # # Revision 1.15 2001/01/30 10:13:36 fonin # Re-released under BSD-like license. # # Revision 1.14 2000/12/18 20:55:13 fonin # Better -n implementation. # # Revision 1.13 2000/12/18 15:26:33 fonin # Added command-line options. -n forces *CHAR DEFAULT '' NOT NULL to be # converted to *CHAR NULL. # AUTO_INCREMENT fields converted not in SERIAL but in # INT* NOT NULL DEFAULT nextval('seqname'). # Documentation refreshed. # Dump enclosed in single transaction from now. # # Revision 1.12 2000/12/14 20:57:15 fonin # Doublequotation bug fixed (in CREATE INDEX ON TABLE (field1,field2)) # # Revision 1.10 2000/11/27 14:18:22 fonin # Fixed bug - occasionaly was broken CREATE SEQUENCE generation # # Revision 1.8 2000/11/24 15:24:16 fonin # TIMESTAMP fix: MySQL output YYYYMMDDmmhhss to YYYYMMDD mmhhss # # Revision 1.7 2000/11/22 23:04:41 fonin # TIMESTAMP field fix. Better doublequoting. Splitting output dump # into 2 transactions - create/load/indexing first, sequence setvals then. # Added POD documentation. # # use Getopt::Std; my %opts; # command line options my $chareg=''; # CHAR conversion regexps my $dq=''; # double quote # parse command line getopts('nhds',\%opts); # output syntax if($opts{h} ne '') { usage(); exit; } # convert CHAR types from NOT NULL DEFAULT '' to NULL if($opts{n} ne '') { $chareg='\s*?(default\s*?\'\')*?\s*?not\s*?null'; } # want double quotes if($opts{d} ne '') { $dq='"'; } if($opts{s} ne '') { $safe_data_conv=1; } else { $safe_data_conv=0; } $|=1; print("------------------------------------------------------------------"); print("\n-- My2Pg 1.28 translated dump"); print("\n--"); print("\n------------------------------------------------------------------"); print("\n\nBEGIN;\n\n\n"); my %index; # contains array of CREATE INDEX for each table my %seq; # contains CREATE SEQUENCE for each table my %primary; # contains primary (eg SERIAL) fields for each table my %identifier; # contains generated by this program identifiers my $j=-1; # current position in $index{table} my @check; # CHECK constraint for current # generating full path to libtypes.c my $libtypesource='libtypes.c'; my $libtypename=`pwd`; chomp($libtypename); $libtypename.='/libtypes.so'; # push header to libtypes.c open(LIBTYPES,">$libtypesource"); print LIBTYPES "/******************************************************"; print LIBTYPES "\n * My2Pg 1.27 \translated dump"; print LIBTYPES "\n * User types definitions"; print LIBTYPES "\n ******************************************************/"; print LIBTYPES "\n\n#include \n"; print LIBTYPES "\n#define ADD_COMMA if(strcmp(result,\"\")!=0) strcat(result,\",\")\n"; # reading STDIN... my $tabledef=0; # we are outside a table definition while (<>) { if(!$tabledef && /^CREATE TABLE \S+/i){ $tabledef=1; } elsif($tabledef && /^\) type=\w*;/i){ # /^\w/i $tabledef=0; } # Comments start with -- in SQL if(/^#/) {# !/insert into.*\(.*#.*\)/i, in mysqldump output s/#/--/; } if($tabledef) { # Convert numeric types s/tinyint\(\d+\)/INT2/i; s/smallint\(\d+\)/INT2/i; s/mediumint\(\d+\)/INT4/i; s/bigint\(\d+\)/INT8/i; s/int\(\d+\)/INT4/i; s/float(\(\d+,\d*\))/DECIMAL$1/i; s/double precision/FLOAT8 /i; s/([\W])double(\(\d+,\d*\))/$1DECIMAL$2/i; s/([\W])double[\W]/$1FLOAT8 /i; s/([\W])real[\W]/$1FLOAT8 /i; s/([\W])real(\(\d+,\d*\))/$1DECIMAL$2/i; # Convert string types s/\w*blob$chareg/text/i; s/mediumtext$chareg/text/i; s/tinytext$chareg/text/i; s/\stext\s+not\s+null/ TEXT DEFAULT '' NOT NULL/i; s/(.*?char\(.*?\))$chareg/$1/i; # Old and New are reserved words in Postgres s/^(\s+)Old /${1}MyOld /; s/^(\s+)New /${1}MyNew /; # Convert DATE types s/datetime/TIMESTAMP/; s/timestamp\(\d+\)/TIMESTAMP/i; s/ date / DATE /i; if((/date/ig || /time/ig) && /[,(]\d{4}(\d{2})(\d{2})[,)]/ && $1>=0 && $1<=12 && $2>=0 && $2<=31) { s/,(\d{4})(\d{2})(\d{2}),/,'$1-$2-$3 00:00:00',/g; } # small hack - convert "default" to uppercase, because below we # enclose all lowercase words in double quotes if(!/^INSERT/) { s/default/DEFAULT/; } # Change all AUTO_INCREMENT fields to SERIAL ones with a pre-defined sequence if(/([\w\d]+)\sint.*auto_increment/i) { $tmpseq=new_name("$table_name"."_"."$+"."_SEQ",28); $seq{$table_name}=$tmpseq; $primary{$table_name}=$+; s/(int.*?) .*AUTO_INCREMENT/$1 DEFAULT nextval\('$tmpseq'\)/i; } # convert UNSIGNED to CHECK constraints if(/^\s+?([\w\d_]+).*?unsigned/i) { $check.=",\n CHECK ($dq$1$dq>=0)"; } s/unsigned//i; # Limited ENUM support - little heuristic s/enum\('N','Y'\)/BOOL/i; s/enum\('Y','N'\)/BOOL/i; # ENUM support if(/^\s+?([\w\d_]+).*?enum\((.*?)\)/i) { my $enumlist=$2; my @item; $item[0]=''; while($enumlist=~s/'([\d\w_]+)'//i) { $item[++$#item]=$1; } # forming identifier name $typename=new_name('enum_'.$table_name.'_'.$item[1],28); # creating input type function my $func_in=" int2* $typename"."_in (char *str) { int2* result; if(str==NULL) return NULL; result=(int2*)palloc(sizeof(int2)); *result=-1;"; for(my $i=0;$i<=$#item;$i++) { $func_in.=" if(strcmp(str,\"$item[$i]\")==0) { *result=$i; }"; } $func_in.=" if(*result == -1) { elog(ERROR,\"$typename"."_in: incorrect input value\"); return NULL; } return (result); }\n"; $types.="\n---"; $types.="\n--- Types for table ".uc($table_name); $types.="\n---\n"; print LIBTYPES "\n/*"; print LIBTYPES "\n * Types for table ".uc($table_name); print LIBTYPES "\n */\n"; $types.="\nCREATE FUNCTION $typename"."_in (cstring) RETURNS $typename AS '$libtypename' LANGUAGE 'c' WITH (ISSTRICT, ISCACHABLE);\n"; # creating output function my $func_out=" char* $typename"."_out (int2 *outvalue) { char* result; if(outvalue==NULL) return NULL; result=(char*)palloc(10); switch (*outvalue) {"; for(my $i=0;$i<=$#item;$i++) { $func_out.=" case $i: strcpy(result,\"$item[$i]\"); break;"; } $func_out.=" default : elog(ERROR,\"$typename"."_out: incorrect stored value\"); return NULL; break; } return result; }\n"; $func_out.="\nbool $typename"."_eq(int2* a, int2* b) { return (*a==*b); } bool $typename"."_ne(int2* a, int2* b) { return (*a!=*b); } bool $typename"."_lt(int2* a, int2* b) { return (*a<*b); } bool $typename"."_le(int2* a, int2* b) { return (*a<=*b); } bool $typename"."_gt(int2* a, int2* b) { return (*a>*b); } bool $typename"."_ge(int2* a, int2* b) { return (*a>=*b); }\n"; $types.="\nCREATE FUNCTION $typename"."_out ($typename) RETURNS cstring AS '$libtypename' LANGUAGE 'c' WITH (ISSTRICT, ISCACHABLE);\n"; $types.="\nCREATE TYPE $typename ( internallength = 2, input = $typename\_in, output = $typename\_out );\n"; $types.="\nCREATE FUNCTION $typename"."_eq ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE FUNCTION $typename"."_lt ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE FUNCTION $typename"."_le ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE FUNCTION $typename"."_gt ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE FUNCTION $typename"."_ge ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE FUNCTION $typename"."_ne ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE OPERATOR < ( leftarg = $typename, rightarg = $typename, -- negator = >=, procedure = $typename"."_lt ); CREATE OPERATOR <= ( leftarg = $typename, rightarg = $typename, -- negator = >, procedure = $typename"."_le ); CREATE OPERATOR = ( leftarg = $typename, rightarg = $typename, commutator = =, -- negator = <>, procedure = $typename"."_eq ); CREATE OPERATOR >= ( leftarg = $typename, rightarg = $typename, negator = <, procedure = $typename"."_ge ); CREATE OPERATOR > ( leftarg = $typename, rightarg = $typename, negator = <=, procedure = $typename"."_gt ); CREATE OPERATOR <> ( leftarg = $typename, rightarg = $typename, negator = =, procedure = $typename"."_ne );\n"; print LIBTYPES $func_in; print LIBTYPES $func_out; s/enum\(.*?\)/$typename/i; } # SET support if(/^\s+?([\w\d_]+).*?set\((.*?)\)/i) { my $setlist=$2; my @item; $item[0]=''; my $maxlen=0; # maximal string length while($setlist=~s/'([\d\w_]+)'//i) { $item[++$#item]=$1; $maxlen+=length($item[$#item])+1; } $maxlen+=1; my $typesize=int($#item/8); if($typesize<2) { $typesize=2; } $internalsize=$typesize; $typesize='int'.$typesize; $typename=new_name('set_'.$table_name.'_'.$item[1],28); # creating input type function my $func_in=" $typesize* $typename"."_in (char *str) { $typesize* result; char* token; if(str==NULL) return NULL; result=($typesize*)palloc(sizeof($typesize)); *result=0; if(strcmp(str,\"\")==0) return result; for(token=strtok(str,\",\");token!=NULL;token=strtok(NULL,\",\")) {"; for(my $i=0,my $j=1;$i<=$#item;$i++,$j*=2) { $func_in.=" if(strcmp(token,\"$item[$i]\")==0) { *result|=$j; continue; }"; } $func_in.=" } if(*result == 0) { elog(ERROR,\"$typename"."_in: incorrect input value\"); return NULL; } return (result); }\n"; $types.="\n---"; $types.="\n--- Types for table ".uc($table_name); $types.="\n---\n"; print LIBTYPES "\n/*"; print LIBTYPES "\n * Types for table ".uc($table_name); print LIBTYPES "\n */\n"; $types.="\nCREATE FUNCTION $typename"."_in (cstring) RETURNS $typename AS '$libtypename' LANGUAGE 'c';\n"; # creating output function my $func_out=" char* $typename"."_out ($typesize *outvalue) { char* result; int i; if(outvalue==NULL) return NULL; result=(char*)palloc($maxlen); strcpy(result,\"\"); for(i=1;i<=2 << (sizeof(int2)*8);i*=2) { switch (*outvalue & i) {"; for(my $i=0,$j=1;$i<=$#item;$i++,$j*=2) { $func_out.=" case $j:"; if($item[$i] ne '') { $func_out.="ADD_COMMA;"; } $func_out.="strcat(result,\"$item[$i]\"); break;"; } $func_out.=" default : break; } } return result; }\n"; $func_out.="\nbool $typename"."_eq($typesize* a, $typesize* b) { return (*a==*b); } $typesize find_in_set($typesize *a, $typesize *b) { int i; for(i=1;i<=sizeof($typesize)*8;i*=2) { if(*a & *b) { return 1; } } return 0; } \n"; $types.="\nCREATE FUNCTION $typename"."_out ($typename) RETURNS cstring AS '$libtypename' LANGUAGE 'c';\n"; $types.="\nCREATE TYPE $typename ( internallength = $internalsize, input = $typename\_in, output = $typename\_out );\n"; $types.="\nCREATE FUNCTION $typename"."_eq ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE FUNCTION find_in_set ($typename,$typename) RETURNS bool AS '$libtypename' LANGUAGE 'c'; CREATE OPERATOR = ( leftarg = $typename, rightarg = $typename, commutator = =, procedure = $typename"."_eq ); CREATE OPERATOR <> ( leftarg = $typename, rightarg = $typename, commutator = <>, negator = =, procedure = $typename"."_eq ); \n"; print LIBTYPES $func_in; print LIBTYPES $func_out; s/set\(.*?\)/$typename/i; } # Change multy-field keys to multi-field indices # MySQL Dump usually ends the CREATE TABLE statement like this: # CREATE TABLE bids ( # ... # PRIMARY KEY (bids_id), # KEY offer_id (offer_id,user_id,the_time), # KEY bid_value (bid_value) # ); # We want to replace this with smth like # CREATE TABLE bids ( # ... # PRIMARY KEY (bids_id), # ); # CREATE INDEX offer_id ON bids (offer_id,user_id,the_time); # CREATE INDEX bid_value ON bids (bid_value); if (s/CREATE TABLE (.*) /CREATE TABLE $dq$1$dq /i) { if($oldtable ne $table_name) { $oldtable=$table_name; $j=-1; $check=''; if($seq{$table_name} ne '') { print "\n\n--"; print "\n-- Sequences for table ".uc($table_name); print "\n--\n"; print "\nCREATE SEQUENCE ".$seq{$table_name}.";\n\n"; } print $types; $types=''; $dump=~s/,\n\).*;/\n\);/gmi; # removing table options after closing bracket: # ) TYPE=ISAM PACK_KEYS=1; $dump=~s/\n\).*/\n\);/gmi; print $dump; $dump=''; } $table_name=$1; } # output CHECK constraints instead UNSIGNED modifiers if(/PRIMARY KEY\s+\((.*)\)/i) { my $tmpfld=$1; $tmpfld=~s/,/","/g if $dq; $tmpfld=~s/ //g; s/PRIMARY KEY\s+(\(.*\))/PRIMARY KEY \($dq$tmpfld$dq\)/i; s/(PRIMARY KEY \(.*\)).*/$1$check\n/i; } if(/^\s*KEY ([\w\d_]+)\s*\((.*)\).*/i) { my $tmpfld=$2; my $ky=$1; $tmpfld=~s/\s*,\s*/","/g if $dq; $tmpfld=~s/(\(\d+\))//g; $index{$table_name}[++$j]="CREATE INDEX ${ky}_$table_name\_index ON $dq$table_name$dq ($dq$tmpfld$dq);"; } if(/^\s*UNIQUE.*?([\w\d_]+)\s*\((.*)\).*/i) { my $tmpfld=$2; my $ky=$1; $tmpfld=~s/,/","/g if $dq; $tmpfld=~s/(\(\d+\))//g; $index{$table_name}[++$j]="CREATE UNIQUE INDEX ${ky}_$table_name\_index ON $dq$table_name$dq ($dq$tmpfld$dq);"; } s/^\s*UNIQUE (.+).*(\(.*\)).*\n//i; s/^\s*KEY (.+).*(\(.*\)).*\n//i; if($dq && !/^\s*(PRIMARY KEY|UNIQUE |KEY |CREATE TABLE|INSERT INTO|\);)/i) { s/\s([A-Za-z_\d]+)\s/ $dq$+$dq /; } } # end of if($tabledef) s/INSERT INTO\s+?(.*?)\s+?/INSERT INTO $dq$1$dq /i; # if not defined -s command-line option (safe data conversion), # attempting to convert timestamp data if(!$safe_data_conv) { # Fix timestamps s/'0000-00-00/'0001-01-01/g; # may corrupt data !!! s/([,(])00000000000000(?=[,)])/$1'00010101 000000'/g; if(/[,(]\d{4}(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})[,)]/ && $1>=0 && $1<=12 && $2>=0 && $2<=31 && $3>=0 && $3<=23 && $4>=0 && $4<=59 && $5>=0 && $5<=59) { s/([,(])(\d{8})(\d{6})(?=[,)])/$1'$2 $3'/g; } if(/[,(]\d{4}(\d{2})(\d{2})[,)]/ && $2>=0 && $2<=12 && $3>=0 && $3<=31) { s/([,(])(\d{4})(\d{2})(\d{2})(?=[,)])/$1'$2-$3-$4 00:00:00'/g; } } $dump.=$_; } if($seq{$table_name} ne '') { print "\n\n--"; print "\n-- Sequences for table ".uc($table_name); print "\n--\n"; print "\nCREATE SEQUENCE ".$seq{$table_name}.";\n\n"; } print $types; $dump=~s/,\n\).*;/\n\);/gmi; $dump=~s/\n\).*/\n\);/gmi; print $dump; # Output indices for tables while(my($table,$ind)=each(%index)) { print "\n\n--"; print "\n-- Indexes for table ".uc($table); print "\n--\n"; for(my $i=0;$i<=$#{$ind};$i++) { print "\n$ind->[$i]"; } } while(my($table,$s)=each(%seq)) { print "\n\n--"; print "\n-- Sequences for table ".uc($table); print "\n--\n"; # setting SERIAL sequence values right if($primary{$table} ne '') { print "\nSELECT SETVAL('".$seq{$table}."',(select case when max($dq".$primary{$table}."$dq)>0 then max($dq".$primary{$table}."$dq)+1 else 1 end from $dq$table$dq));"; } } print("\n\nCOMMIT;\n"); close(LIBTYPES); open(MAKE,">Makefile"); print MAKE "# # My2Pg \$Revision: 1.13 $ \translated dump # Makefile # all: libtypes.so libtypes.o: libtypes.c gcc -c -fPIC -g -O libtypes.c libtypes.so: libtypes.o ld -Bshareable -o libtypes.so libtypes.o"; close(MAKE); # # Function generates unique identifier # Args : template name, max length # Globals: %identifier # sub new_name() { my $name=lc(shift @_); my $len=shift @_; # truncate long names if(length($name)>$len) { $name=~s/(.{$len}).*/$1/i; } # find reserved identifiers if($identifier{$name}!=1) { $identifier{$name}=1; return $name; } else { for(my $i=1,my $tmpname=$name.$i;$identifier{$tmpname}!=1;) { $tmpname=$name.$i } $identifier{$tmpname}=1; return $tmpname; } die "Error during unique identifier generation :-("; } sub usage() { print < Copyright (c) 2000 Valentine Danilchuk This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the code source for license details. SYNTAX: my2pg [-hnds] OPTIONS: h - this help n - convert *CHAR NOT NULL DEFAULT '' types to *CHAR NULL d - double quotes around table and column names s - do not attempt to convert data (timestamps at the moment) EOF ; } =head1 NAME my2pg - MySQL -> PostgreSQL dump conversion utility. =head1 SYNTAX mysqldump db | ./my2pg.pl [-nds] > pgsqldump.sql vi libtypes.c make psql database < pgsqldump.txt where =over 4 =item F - file suitable for loading into PostgreSQL. =item F - C source for emulated MySQL types (ENUM, SET) generated by B =back =head1 OVERVIEW B utility attempts to convert MySQL database dump to Postgres's one. B performs such conversions: =over 4 =item * Type conversion. It tries to find proper Postgres type for each column. Unknown types are silently pushing to output dump; ENUM and SET types implemented via user types (C source for such types can be found in B file); =item * Encloses identifiers into double quotes. All column and table names should be enclosed to double-quotes to prevent conflict with reserved SQL keywords; =item * Converting AUTO_INCREMENT fields to SERIAL. Actually, creating the sequence and setting default value to nextval('seq'), well, you know :) =item * Converting KEY(field) to CREATE INDEX i_field on table (field); =item * The same for UNIQUE keys; =item * Indices are creating AFTER rows insertion (to speed up the load); =item * Translates '#' MySQL comments to ANSI SQL '--' =back It encloses dump in transaction block to prevent single errors during data load. =head1 COMMAND-LINE OPTIONS My2pg takes the following command-line options: =over 2 =item -n Convert *CHAR DEFAULT '' NOT NULL types to *CHAR NULL. Postgres can't load empty '' strings in NOT NULL fields. =item -d Add double quotes around table and column names =item -h Show usage banner. =item -s Do not attempt to convert data. Currently my2pg only tries to convert date and time data. =back =head1 SIDE EFFECTS =over 4 =item * creates file B in current directory overwriting existed file without any checks; =item * the same for Makefile. =back =head1 BUGS Known bugs are: =over 4 =item * Possible problems with the timestamp data. PostgreSQL does not accept incorrect date/time values like B<2002-00-15>, while MySQL does not care about that. Currently my2pg cannot handle this issue. You should care yourself to convert such a data. =item * Use -s option if your numeric data are broken during conversion. My2pg attempts to convert MySQL timestamps of the form B to B and B to B. It performs some heuristic checks to ensure that the month,day,hour,minutes and seconds have values from the correct range (0..12, 0..31, 0..23, 0..59, 0..59 respectively). It is still possible that your numeric values that satisfy these conditions will get broken. =item * Possible problems with enclosing identifiers in double quotes. All identifiers such as table and column names should be enclosed in double quotes. Program can't handle upper-case identifiers, like DBA. Lower-case identifiers are OK. =item * SET type emulation is not full. LIKE operation on SETs, raw integer input values should be implemented =item * B generated during output is platform-dependent and surely works only on Linux/gcc (FreeBSD/gcc probably works as well - not tested) =item * Generated B contain line #include This file may be located not in standard compiler include path, you need to check it before compiling. =back =head1 AUTHORS B<(c) 2000-2002 Maxim V. Rudensky (fonin@ziet.zhitomir.ua)> (developer, maintainer) B<(c) 2000 Valentine V. Danilchuk (valdan@ziet.zhitomir.ua)> (original script) =head1 CREDITS Great thanks to all those people who provided feedback and make development of this tool easier. Jeff Waugh Joakim Lemström || Yunliang Yu Brad Hilton If you are not listed here please write to me. =head1 LICENSE B =cut