Migrating a database to AWS RDS with minimal downtime

A little while back our company decided it was time to move our self-managed MySQL database into AWS’s managed Relational Database Service (RDS). Surprisingly, I found no single set of instructions for what would seem to be a common use case. Below is the result of many hours of research and testing.

The migration happens in four phases.

  1. Copy database structure without data (or triggers) and copy users/permissions
  2. Copy the data
  3. Replication
  4. Change over

Preparation

  • Create a new database in RDS (we will refer to this as “the new database” for the remainder of this article)
  • Check that your database-connected applications can connect to the new database. You may need to add their IPs to the RDS security group or set up VPC peering, depending on your setup.

The simplest way to test connectivity is to run a trivial query:

mysql -h <rds_endpoint> -P 3306 -u <username> -p -e "SELECT 1;"

(You may also run into an issue if a database user connects to the current database via an internal AWS IP but will connect to the new database via a public IP, or vice versa. In that case you’ll have to add a new user user@new-ip with the same permissions as user@old-ip.)
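
For example, a minimal sketch of recreating a user under its new host might look like the following (the user name, IPs and database name are hypothetical):

-- On the current database: see what the existing user is allowed to do
SHOW GRANTS FOR 'appuser'@'10.0.0.5';
-- On the new database: recreate the user under the host it will now connect from
CREATE USER 'appuser'@'203.0.113.10' IDENTIFIED BY 'changeme';
GRANT SELECT, INSERT, UPDATE, DELETE ON <database_name>.* TO 'appuser'@'203.0.113.10';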

  • Prepare your current database for replication by turning on binary logging and setting a unique server ID. You can do that by adding the following to your my.cnf file and then restarting MySQL.
[mysqld]
log_bin=mysql-bin
binlog_format=MIXED
server-id=1
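
After restarting MySQL you can verify that binary logging is actually on; something like this should show log_bin = ON along with the current binary log coordinates:

mysql -u <username> -p -e "SHOW VARIABLES LIKE 'log_bin'; SHOW MASTER STATUS;"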

Note: The following scripts assume that you can ssh into the current database server and connect to the new database from there. You can easily modify these scripts for other setups. Make sure you have access to both databases with users that have sufficient permissions.

Phase 1a – Database structure

*Special thanks to Islam E for his great article that helped with this section.

Step 1 – Dump tables and views

mysqldump --user=<username> -p \
   --databases <database_name> [<database_name> ...] \
   --skip-lock-tables --skip-triggers --single-transaction --no-data -f \
   -r structure.sql

Step 2 – Dump stored procedures and functions

mysqldump --user=<username> -p \
   --databases <database_name> [<database_name> ...] \
   --no-create-db --no-create-info \
   --skip-lock-tables --skip-triggers --single-transaction --no-data -f \
   --compress --routines --events \
   -r routines.sql

(Note the --skip-triggers flag. Importing the triggers at this point would cause replication to fail; we will import the triggers later on.)

Step 3 – Remove DEFINER line from the backup files

The root user that may have been used to create your database objects does not exist in the new RDS database, so importing statements that still carry DEFINER clauses would fail. Removing the DEFINER clauses from the previously generated SQL files resolves that issue.

sed 's/\sDEFINER=`[^`]*`@`[^`]*`//g' -i structure.sql
sed 's/\sDEFINER=`[^`]*`@`[^`]*`//g' -i routines.sql
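
A quick sanity check that no DEFINER clauses slipped through (both counts should come back as 0):

grep -c "DEFINER=" structure.sql routines.sql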

Step 4 – Import tables, views, procedures and functions

mysql -h <rds_endpoint> -P 3306 -u <username> -p -f < structure.sql
mysql -h <rds_endpoint> -P 3306 -u <username> -p -f < routines.sql
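
To spot-check the import, you can list what landed in the new database; a rough check along these lines should show the empty tables and the routines (substitute your database name):

mysql -h <rds_endpoint> -P 3306 -u <username> -p \
   -e "SHOW TABLES IN <database_name>; SHOW PROCEDURE STATUS WHERE Db = '<database_name>';"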

Phase 1b – Users and permissions

*Special thanks to Sumit P for his great article that helped with this section.

Step 5 – Get a list of all users excluding root, localhost users and proxy users.

mysql -B -N -u<username> -p -e "SELECT concat(\"'\",user,\"'@'\",host,\"'\") FROM mysql.user WHERE user <> 'root' AND host not in ('localhost','127.0.0.1') AND user not in (select user from mysql.proxies_priv where proxied_user <> '')" > mysql_all_users.txt 2> /dev/null
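
The resulting file should contain one quoted user@host pair per line, along the lines of (hypothetical values):

'appuser'@'10.0.0.5'
'reporting'@'%'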

Step 6 – Generate file to recreate users and permissions

This is a loop that runs a mysql command for each user. To avoid having to repeatedly enter the database password, include the password in the command itself instead of leaving -p alone as in the previous scripts. (You can put this command in a file and run it as a shell script if you don’t want to leave the password in your shell history.)

while read line; do mysql -B -N -u<username> "-p<password>" -e "SHOW GRANTS FOR $line"; done < mysql_all_users.txt > mysql_all_users_sql.sql 2> /dev/null

Step 7 – Remove and/or replace permissions that are not supported by RDS

Since RDS is a managed solution, many admin-level privileges are not allowed by AWS. Operations that would normally require these privileges can be performed through the AWS console instead.

# Remove unavailable permissions
sed -i 's/$/;/;s/, FILE//g;s/, SHUTDOWN//g;s/, CREATE TABLESPACE//g;s/, SUPER//g' mysql_all_users_sql.sql
# Replace 'GRANT ALL PRIVILEGES' with specific permissions
SUPER_USER_PERMISSIONS="GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, PROCESS, REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER ON *.* WITH GRANT OPTION, REPLICATION SLAVE TO"
sed -i "s/GRANT ALL PRIVILEGES ON \*\.\* TO/${SUPER_USER_PERMISSIONS}/g" mysql_all_users_sql.sql
sed -i "s/WITH GRANT OPTION, REPLICATION SLAVE//g" mysql_all_users_sql.sql

Step 8 – Import users

mysql -h <rds_endpoint> -P 3306 -u <username> -p -f < mysql_all_users_sql.sql
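
You can then confirm on the new database that the users made it over; a quick check like this should list them:

mysql -h <rds_endpoint> -P 3306 -u <username> -p -e "SELECT user, host FROM mysql.user;"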

Phase 2 – Copy the data

At this point your new database should contain a set of empty tables and all of the users/permissions from the current database. The triggers should not have been copied yet, as they would conflict with replication. Each step in this next phase may take a long time depending on the size of your database, so use a tool like screen or tmux that lets you push a shell session into the background, as in the sketch below.
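
For example, with screen you could start a named session for the dump, detach, and come back later (the session name is arbitrary):

screen -S rds-migration      # start a named session and run the dump inside it
# detach with Ctrl-a d, then later re-attach with:
screen -r rds-migration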

Step 1 – Dump data from current database

mysqldump --user=<username> -p \
   --databases <database_name> [<database_name> ...] \
   --master-data=2 \
   --no-create-db --no-create-info --skip-triggers \
   --single-transaction --order-by-primary \
   --routines=0 --triggers=0 --events=0 -f \
   -r database_backup.sql

Step 2 – Import data into new database

mysql -h <rds_endpoint> -P 3306 -u <username> -p -f < database_backup.sql

Step 3 – Get MASTER_LOG_FILE and MASTER_LOG_POS for replication

Save the output of this command for later. This is very important!

head -30 database_backup.sql | grep "CHANGE MASTER TO MASTER_LOG_FILE"
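
With --master-data=2 the statement is written as a comment near the top of the dump, so the output should look roughly like this (the file name and position will of course differ):

-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=45678;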

Phase 3 – Replication

Step 1 – Create a replication user

On the current database create a replication user:

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'changeme';
GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'repl_user'@'%';

Step 2 – Start replication

Use the MASTER_LOG_FILE and MASTER_LOG_POS from Phase 2 Step 3 to run the following commands on the new database.

CALL mysql.rds_set_external_master ('<IP of master>', 3306, 'repl_user', 'changeme', '<MASTER_LOG_FILE>', <MASTER_LOG_POS>, 0);
CALL mysql.rds_start_replication;
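
Filled in with the illustrative values from the Phase 2 grep (and a hypothetical master IP), the first call would look something like:

CALL mysql.rds_set_external_master ('198.51.100.20', 3306, 'repl_user', 'changeme', 'mysql-bin.000123', 45678, 0);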

Depending on how long it’s been since taking the database backup in Phase 2 Step 1, it may take some time for the new database to catch up to the current database. You can check on the replication status by running SHOW SLAVE STATUS; on the new database. When the replication has caught up it will show that it is 0 seconds behind the master.

At this point the new database should mirror the current database. Every change to the current database will be replicated to the new database. You should do thorough testing to confirm that this is true.

You can check in on the replication status (SHOW SLAVE STATUS;) to confirm that the replication did not run into any errors and is otherwise running smoothly. If replication does fail at any time, that command will show what the error was.
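
When reading the SHOW SLAVE STATUS output (the \G terminator makes it easier to scan), these are roughly the fields to watch:

SHOW SLAVE STATUS\G
-- Slave_IO_Running: Yes
-- Slave_SQL_Running: Yes
-- Seconds_Behind_Master: 0
-- Last_Error: (should be empty)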

If replication fails on a particular query you can skip that query by running CALL mysql.rds_skip_repl_error; (make sure that is really what you want before going that route). Sometimes you’re better off stopping replication (CALL mysql.rds_stop_replication; CALL mysql.rds_reset_external_master;), fixing the issue, and starting over from the beginning.

When you are ready to make the switch continue below.

Phase 4 – Change Over

Step 1 – Shut it all down!

Shut down ALL applications that connect to the current database. The last thing you want is multiple systems writing to separate databases. You want this to be a clean break.

Check the general log table to see any queries still being made to the database (and confirm that there are none):

SELECT * FROM mysql.general_log;
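
Note that the general log is off by default; if that table is empty because logging was never enabled, you can turn it on temporarily (writing to the table rather than a file) with:

SET GLOBAL log_output = 'TABLE';
SET GLOBAL general_log = 'ON';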

After confirming that all applications are disconnected from the old database, check that the replication is up to date: run SHOW SLAVE STATUS; and confirm that replication is 0 seconds behind the master.

Step 2 – Stop replication

# Stop replication
CALL mysql.rds_stop_replication;
# Reset external master
CALL mysql.rds_reset_external_master;

Step 3 – Migrate triggers

mysqldump --user=<username> -p \
    --databases <database_name> [<database_name> ...] \
    --no-create-db --no-create-info \
    --skip-lock-tables --single-transaction --no-data --compress --skip-opt -f \
    -r triggers.sql

sed 's/\sDEFINER=`[^`]*`@`[^`]*`//g' -i triggers.sql

mysql -h <rds_endpoint> -P 3306 -u <username> -p -f < triggers.sql
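
A final check that the triggers arrived on the new database:

mysql -h <rds_endpoint> -P 3306 -u <username> -p -e "SHOW TRIGGERS IN <database_name>;"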

Step 4 – Bring your applications back up

One at a time, update the connection string in each app to point to the new RDS database instance and turn it back on.
