Thursday, September 29, 2011

Migrating Rails from MySQL to Oracle

Assuming you have MySQL, Oracle, and Oracle Client (with sqlplus) setup and have already created the schema, then you just do the following: (We created the schema via Rails, first with rake db:schema:dump then rake db:schema:load after adding names to indexes and shortening column names and reference to those attributes. The benefit of doing it that way was to ensure the correct datatypes for Active Record would be created.)

mysqldump -u your_username --password='your_password'  name_of_mysql_database --compatible=oracle --no-create-info --no-create-db --skip-opt --skip-set-charset --skip-extended-insert --skip-comments --skip-quote-names --compact > /some/path/backup.sql
cd /some/path/
grep ដ backup.sql
# Look at the output of grep. If it returns anything, change the unicode character here and later.
# Changes 2 sequential backslashes to a unicode character temporarily.
sed -e "s/\\\\\\\\/ដ/g" backup.sql > backup.sql.1
# Unescape characters that mysqldump escapes.
# Note: there is a good chance that the escape character replacements will fail to work when executed. Because we had a table with heavy use of escape characters that
# made sqlplus choke. You might want to look at the generated SQL and adjust these accordingly. Possibly characters escaped at beginning or end of value or next to each other won't work without more replacements to make those valid, since double single-quote is an escaped single-quote in SQL.
sed -e "s/\\\\'/''/g" \
    -e "s/\\\\0/'||Chr(0)||'/g" \
    -e "s/\\\\a/'||Chr(7)||'/g" \
    -e "s/\\\\b/'||Chr(8)||'/g" \
    -e "s/\\\\t/'||Chr(9)||'/g" \
    -e "s/\\\\n/'||Chr(10)||'/g" \
    -e "s/\\\\f/'||Chr(12)||'/g" \
    -e "s/\\\\r/'||Chr(13)||'/g" \
    -e "s/\\\\e/'||Chr(27)||'/g" \
    -e "s/\\\\"/"/g" \
    backup.sql.1 > backup.sql.2
# The first expression changes the unicode char back to 2 sequential backslashes. The second two expressions fix dates. You may need to alter the date format.
sed -e "s/ដ/\\\\/g" \
    -e "s/'[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]'/to_date(&,'yyyy-mm-dd HH24:MI:SS')/g" \
    -e "s/'[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'/to_date(&,'yyyy-mm-dd')/g" \
    backup.sql.2 > backup.sql.3
sqlplus oracle_username/oracle_password@host:port/SID

The following is done in sqlplus:

-- this makes ampersands ok
SET SCAN OFF
-- this commits after every insert
SET AUTOCOMMIT ON
-- this outputs the insert, so that when there is an error, you know what it was trying to execute
SET ECHO ON
-- this eliminates output stating how many rows were created
SET FEEDBACK OFF
-- this shows the current time next to each statement execution, so in the end you know how long it took and when there were errors
SET TIME ON
@/some/path/backup.sql.3

Basically, this does the export, does text alterations to unescape and Oracle SQL-fy, and makes the datetimes and dates into Oracle datetimes and dates. Then it imports it. This may not work for you, or might require adjustments to work properly.

Note: we first tried Oracle SQLDeveloper to migrate the data. It required dba access for the schema user, created two users, granted one of them various rights, and failed in the end. Maybe it would work for you, though.

When you're done, to test out that the data is the same between the MySQL and Oracle you could try using my schema-dump utility to output the data in both databases and diff the two files.

Pentaho was mentioned as a possibility for a migration tool in a Stack Overflow thread about MySQL to PostgreSQL migration, so you might look into that also.

No comments: