I’ve been fixing
and breaking MySQL replication clusters for quite some time now, and I realized one of the biggest problems is the lack of consistency provided through MySQL replication.
Sure your data will be consistent most of the time, but how do you check if it really IS consistent across all your slaves? How do you make sure your slaves don’t have missing or invalid entries?
I’m sure you’ve all run:
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
Not always a good idea…
Well today I present you with a little bash script I’ve written which performs all these verifications. I haven’t invented anything. On the contrary, I’m just using the methods and tools provided by Percona in their fantastic toolset called Maatkit.
- Change the default “User Defined Variables” in the script, to reflect your MASTER mysql server.
- Configure slave reporting on each slave so that “SHOW SLAVE HOSTS\G” works from the MASTER
- Make sure your slaves are running properly: Slave_IO_Running: Yes, Slave_SQL_Running: Yes, Seconds_Behind_Master: 0
- Make sure you have Maatkit installed (apt-get install maatkit)
- Run the script on the MASTER like this: ./mysql_consistency.sh -c
I have tested this script on Debian Lenny (5.0) with maatkit version 4334-1 and MySQL 5.0.
How does it work?
When you run the script, after performing some necessary sanity checks, the MASTER will create a checksum of every database and every table. It will store those results in the default database called test in the table called checksum. It will then replicate the data to the SLAVES who will create their own checksums on the same databases and tables. Afterwards it will tell you which slaves are consistent and which ones are not.
[root@db01 /opt (353)]#: ./mysql_consistency.sh -c
Replication Slave ID 3 on 172.16.0.63:3306 is consistent.
Replication Slave ID 4 on 172.16.0.64:3306 is consistent.
Replication Slave ID 5 on 172.16.0.65:3306 is inconsistent. Requires rebuild
You might get some error messages too.
Download the script here: mysql_consistency.sh
Please notify me in the comments of any errors or adjustements as I’ve only used this in a small test-environment.