As we all know,there're three kinds of replication in MySQL nowadays.Such as,asynchronous replication,(full)synchronous replication,semi-synchronous replication.What's the difference between them?First of all,let's see the intact architecture picture of MySQL replication:
What will client do?
commits transactions to master
receives results from master
What will master do?
- executes transactions
- generates binary logs
- dump thread sends contents(binary logs) to slave
- returns results to client
What will slave do?
- connects to master
- IO Thread asks for data(binary logs) and gets them
- generates relay logs
- SQL Thread applies data(relay logs)
Method of different MySQL Replication
Generally speaking,the data changed on master will be continuously sent to slave.So the data on slave seems to be equal with the master.This mechanism is usually used to backup on slave(reduce the pressure of master),construct HA architecture(failover or separate reading/writing operations),etc.
Nevertheless,on account of different reasons,slave frequently defers in almost all the scenarios what's often grumbled by MySQL dba.Below are different kinds of MySQL replication.Let's see the details.
- asynchronous replication
金沙官网线上， **Since MySQL 3.2.22,this kind of replication was supported with statement format of binary log.Then,untill MySQL 5.1.5,row format of binary log was supported either.The mechanism of it is that as soon as the master dump thread has sent the binary logs to the slave,the master server returns the result to client.There's nothing to guarantee the binary logs are normally received by the slave(maybe the network failure occurs simultaneously).So it's unsafe in consistency what means your transactions will lose in the replication.This is also the original replication of MySQL.Here's the picture about the procedure:
1. Client sends dml operations to the master while the transaction starts.
2. Master executes these dml operations from client in transaction.
3. Generates some binary logs which contains the transaction information.
4. Master will return results to the client immediately after dump thread has sent these binary logs to slave.
5. Slave receives the binary logs by IO_Thread and apply the relay logs by SQL_Thread.
In step 4,master won't judge whether slave has received the binary logs (which are sent by itself) or not.If the master crashs suddenly after it has sent the binary logs,but slave does not receive them at all on account of network delay.Only if the slave takes over the application at this time,the committed transactions will miss which means data loss.This is not commonly acceptable in most important product systems especially in the financial ones.
- synchronous replication
Synchronous replication requires master to return results to client only after the transactions have been committed by all the slaves(receive and apply).This method will severely lead to bad performance on master unless you can guarantee the slaves can commit immediate without any delay(infact it's tough).Now,the only solution of synchronous replication is still the MySQL NDB Cluster.Therefore,it's not recommended to use synchronous replication way.
- semi-synchronous replication
Semi-synchronous replication seems a workaround of above two method which can strongly increase the consistency between master and slave.It's supported since MySQL 5.5 and enhanced in MySQL 5.7.What's the mechanism of semi-sychronouos replication?Master is permitted to return the result to client merely after only one slave has received binary logs,write them to the relay logs and returns an ACK signal to master.There're two ways of it,that is,after_commit & after_sync.Let's see the difference of them:
after_commit(Since MySQL 5.5):
In this method,master performs a commit before it receives ACK signal from slave.Let's suppose a situation that once master crashs after it commits a transaction but it hasn't receive the ACK signal from slave.Meanwhile,failover makes slave become the new master.How does the slave deal with then?Will the transaction lose?It depends.There're two scenarios:
- Slave has received the binary log,and then turns it into relay log and applys it.There's no transaction loss.
- Slave hasn't received the binary log,the transaction committed by master just now will lose,but the client won't fail(only inconsistent in replication).
Therefore,after_commit cannot guarantee lossless replication.after_commit is the default mode(actually it's the only mode can be use) which is supported by MySQL 5.5 & 5.6.
after_sync(since MySQL 5.7):
In the picture above,the t1 transaction shouldn't be lost because of the master merely commits to the storage engine after receive the ACK signal from slave.In spite of master may crash before receiving ACK signal,no transaction will lose as the master hasn't commit at all.Meanwhile,the t2 transaction also get consistent query here.
In order to improve the data consistency(since after_commit has avoidless deficiency),MySQL official enhances the semi-synchronous replication which can be called "loss-less semi-synchronous replication" in MySQL 5.7 by add after_sync mode in parameter "rpl_semi_sync_master_wait_point".
Caution,semi-synchronous replication may turn into asynchronous replication whenever the delay time of slave surpass the value which is specified in parameter "rpl_semi_sync_master_timeout"(default values is 10000 milliseconds).Why it is permitted?I'm afraid in order to consider the performance of master.Notwithstanding,you can also play a trick to prevent it from being converted over by set a infinite number in this parameter such as "10000000" or above.Especially in case that your product system is too important to not lose data.
Further more,to configure semi-sychronous replication,you should implement the optional plugin component "rpl_semi_sync_master",which can be check by using command "show plugins;"
- Commonly,semi-sync replication is strongly recommended when implements MySQL replication nowadays(with gtid).
- I utterly recommend to upgrade product system to MySQL 5.7 in order to use "after_sync" mode which can avoid data loss.
- Be careful of specify an inappropriate value in parameter "rpl_semi_sync_master_timeout" which will cause converting semi-sync to async replication.