12/03/2023
Replication - check slave integrity
Jump to navigationJump to search
mk-table-checksum is an excellent tool for verifying whether slave is in sync with the master. I have tested it in one of the mysql setup in MAS servers. The steps to follow are,
On the master,
Create a database. You can name anything. In this case I have named it test.
Then create a table to store the checksum values. Here I call it checksum. The syntax for creating table is
CREATE TABLE checksum (
db char(64) NOT NULL,
tbl char(64) NOT NULL,
chunk int NOT NULL,
boundaries char(100) NOT NULL,
this_crc char(40) NOT NULL,
this_cnt int NOT NULL,
master_crc char(40) NULL,
master_cnt int NULL,
ts timestamp NOT NULL,
PRIMARY KEY (db, tbl, chunk)
);
Then create a user like
grant all on *.* to 'slvchk'@10.10.2.26 identified by 'chkSum';
where slvchk is the username and chkSum is the pasword. 10.10.2.26 is the ip of the master.
Then execute the following command...
mk-table-checksum -u slvchk -pchkSum --ignore-databases mysql,information_schema,test,mysql_monitor,remote --replicate test.checksum
--empty-replicate-table 10.10.2.26
(--ignore-databases is used to ignore the databases we dont want to check. --empty-replicate-table deletes any previous checksums existing in the checksum table.)
The above command will populate the checksum table with the checksum value. As a result of replication, the the checksum table will be populated with data in the slave also.
During it's ex*****on, keep an eye on the load, as it might increase the load on the system. So better do it in off-peak hours on production servers.
We get an output like the following..
DATABASE TABLE CHUNK HOST ENGINE COUNT CHECKSUM TIME WAIT STAT LAG
thefinerthings08 acl_actions 0 10.16.55.86 InnoDB 70 43247b13 0 NULL NULL NULL
thefinerthings08 acl_permissions 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 addresses 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 amazon_unsync_ipns 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 attachments 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 auto_status_changes 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 autocomplete_searches 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 autocomplete_stopwords 0 10.16.55.86 InnoDB 233 a4dc9c1b 0 NULL NULL NULL
thefinerthings08 ban_countries 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 ban_email_domains 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 ban_ips 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 ban_referrers 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
thefinerthings08 batch_print_files_content 0 10.16.55.86 MyISAM 0 NULL 0 NULL NULL NULL
thefinerthings08 batch_prints 0 10.16.55.86 MyISAM 0 NULL 0 NULL NULL NULL
thefinerthings08 bundles 0 10.16.55.86 InnoDB 4 414aaf5a 0 NULL NULL NULL
thefinerthings08 cache 0 10.16.55.86 InnoDB 2 6c5d45d3 0 NULL NULL NULL
thefinerthings08 cart_swap 0 10.16.55.86 InnoDB 0 NULL 0 NULL NULL NULL
Wait for the above command to finish. It will take time depending on the no and size of databases. So better do it in screen.
When this command has finished executing, check the slave lag.
On the slave,
show slave status\G
If the seconds behind master value is not zero, wait for it to become zero.
After making sure seconds behind master is zero, execute the following command on master...
On the master,
mk-table-checksum -u slvchk -pchkSum --ignore-databases mysql,information_schema,test,mysql_monitor,remote --replicate test.checksum
--replicate-check 1 10.10.2.26
(--replicate check 1 will check the masters' slaves upto one level, ie 1 slave. If 2, then upto two levels ie two slaves)
You will get an output like the following...
Differences on P=3306,h=10.16.55.75
DB TBL CHUNK CNT_DIFF CRC_DIFF BOUNDARIES
admin_homeseason addresses 0 0 1 1=1
admin_homeseason categories 0 0 1 1=1
admin_homeseason cms_pages 0 0 1 1=1
admin_homeseason crons 0 0 1 1=1
admin_homeseason errors 0 0 1 1=1
admin_homeseason files 0 0 1 1=1
admin_homeseason ip2country 0 0 1 1=1
admin_homeseason items 0 0 1 1=1
admin_homeseason semaphores 0 0 1 1=1
admin_homeseason sites 0 0 1 1=1
admin_homeseason taxes 0 0 1 1=1
admin_homeseason users 0 0 1 1=1
admin_homeseason user_groups 0 0 1 1=1
admin_homeseason vars 0 0 1 1=1
admin_powerswabs categories 0 0 1 1=1
admin_powerswabs crons 0 0 1 1=1
admin_powerswabs errors 0 0 1 1=1
admin_powerswabs ip2country 0 0 1 1=1
In the above output 10.16.55.75 is the slave of this master.
The 4th column CNT_DIFF column is what we are interested in.
A CNT_DIFF of zero means slave is in sync with the master,
If CNT_DIFF is a positive integer value, then there is that much rows extra on slave(this should not happen in a master-slave setup).
If CNT_DIFF is a negative integer value, then there is that much rows missing on slave.
After finding the tables with differences, we can use mk-table-sync to sync the slave to master.