A Complete solution for Database administration: Performance Improvements in MySQL 8.0 Replication

MySQL 8.0 became Generally Available (GA) on April 19th, a great moment for us working on MySQL at Oracle. It is now a “fully grown adult” packed with new features, and improvements to existing features, as described here.

This blog post focuses on the impact of replication performance improvements that went into MySQL 8.0. This is another long post, but I hope some people will enjoy the details.

1. What is new in MySQL 8.0 replication performance?

MySQL 8.0 introduced many features related to replication, and quite a few of them are performance or monitoring improvements, as the following list shows.

We have posted blogs about the these features while they were being developed, please be sure to check the posts at mysqlhighavailability.com for more information.

2. Impact of the Performance Improvements

The next sections present the performance impact of some of the improvements in MySQL 8.0 compared to MySQL 5.7. The tests were performance with the Sysbench benchmarks in the same setup as presented in this post, please check the details there.

2.1 Higher replication throughput at low-concurrency

When replicating workloads with a small number of sessions on the master, or on systems with very fast storage, the slave applier may be unable to catch up with the master due to insufficient parallelism on the replication stream. The reason for this is that the master tags two transactions as independent if they hold simultaneously all their locks, and that becomes less effective if there only a few transactions in transit or if the group commit size is small.

With MySQL 8.0 a new mechanism was introduced to track independent transactions, one which uses the WRITESET of each transaction to infer which transactions can be executed in parallel in the slave. Tracking dependencies this way no longer depends on any particular execution order on the master, and can even execute in parallel single-threaded workloads. More information is available in Improving the Parallel Applier with Writeset-based Dependency Tracking.

The following charts show the slave applier throughput when catching up on a Sysbench Update Index workload using the WRITESET dependency tracking (MySQL 8.0) and the older scheme that is now called COMMIT_ORDER (MySQL 5.7) .

The charts show that, when using MySQL 8.0 with WRITESET dependency tracking, the slave applier has much higher throughput at low concurrency in Sysbench Update Index, with a 300% increase at 8 client sessions without durability, only coming close after more then 64 client sessions are used.

2.2 Reduced recovery time on Group Replication

One of the implications of using WRITESET dependency tracking on the master is that Group Replication can now recover faster when joining a group, as it uses asynchronous replication to reach the same state as the other members of the group. Note that Group Replication already used the WRITESET dependency tracking after joining the group (actually, this is where the feature came up).

The following tests shows how long a node takes to join a group when the user workload on the writer-node is constant at around 33% and 66% of the server capacity for that workload, using COMMIT_ORDER (MySQL 5.7) and WRITESET (MySQL 8.0) dependency tracking.

The results show that recovery is now almost 8 times faster when using Sysbench RW and almost 3 times faster when using Sysbench Update Index at 66% capacity.

That means that after initial setup a joining member can be up and running much faster when WRITESET dependency tracking is used.

2.3. Improved scalability of Group Replication

The maximum sustained throughput of Group Replication has also increased significantly, bringing it close to asynchronous replication (check session 3.1 below). There are several small improvements to help on that, but one of the main reasons is the improved coordination around the relay log as described in No new options, no new commands… Just faster at full load, where it counts!

The following tests show the sustained throughput (so, the total throughput of the system and not just on the master) of write-only Sysbench benchmarks (Update Index and Write-only) on MySQL 5.7 and MySQL 8.0, using both durable and non-durable settings.

The charts show that there is a significant throughput gain in MySQL 8.0, around 40% with durable setting and 60% with non-durable settings.

2.4 More efficient JSON replication

Another area with significant gains is on replicating JSON documents more efficiently. It is now possible to send only the bytes changed instead of the full document when only a small part of the documents is changed. More detailed information is available in Efficient JSON Replication in MySQL 8.0.

The impact of this change depends on the workload, but it is particularly beneficial when there are small updates to large JSON documents. To measure this we developed a specialized benchmark based on Sysbench RW with a table with JSON fields, changing around 10% of those in each update.

The following charts show the size of the binary log and the master and slave throughput when applying the transactions from that specialized benchmark.

The binary log size is reduced by 50% when using partial JSON replication with binlog-images=FULL, given that the before image is still complete. With binlog-images=MINIMAL alone the binary log size is reduced 5 fold, given that the rows are large and only one field is changed at any time. But once partial JSON is also active, the size drops dramatically by another 18 fold, to reach a binary log that is almost 100 times smaller.

Enabling partial JSON replication on this benchmark carries a loss on the master throughput, but using binlog-images=MINIMAL implies a gain in both cases, so actually enabling both deliver about the same throughput on the master as having both disable. On the slave throughput, there is always a gain from using partial JSON (around 10%), much larger if minimal-images are also used (40%).

3. Additional comparisons

In addition to presenting the benefits from MySQL 8.0 compared to MySQL 5.7, the following sections present some additional comparisons that you may find interesting.

3.1. Asynchronous vs Group Replication throughput

Group Replication is a complex system built on top of MySQL replication. In terms of performance, one of our main goals was to minimize the overhead that was expected compared to asynchronous replication, due to the extra work that has to be done.

To show the status on that front, the following charts presents the throughput on the slave when using asynchronous replication with both COMMIT_ORDER and WRITESET dependency tracking, and Group Replication.

As it can be seen, the sustained throughput of Group Replication is now much closer to asynchronous replication. In fact, with COMMIT_ORDER dependency tracking on asynchronous replication the sustained throughput of Group Replication is actually higher in most cases.

But with WRITESET dependency tracking asynchronous replication still has an edge at lower thread counts, as there is not enough parallelism to hide the added latency from reaching agreement on the network.

3.2. Group Replication on Local vs Wide Area Networks

In the MySQL reference manual one can see that “Group Replication is designed to be deployed in a cluster environment where server instances are very close to each other, and is impacted by both network latency as well as network bandwidth“.

The reality is that Group Replication may be used on high latency networks (WANs) as long as applications withstand the network latencies involved. The system behaviour depends on intricate details of the Paxos group communication system and the actual bandwidth and latencies between the members.

The following charts show the transaction throughput and latency currently achieved using Sysbench RW and Update Index on a Group Replication group of three members, varying the RTT to one of those members between 100us (LAN), 2ms, 10ms, 40ms, 100ms and 200ms.

Throughput:

Latency:

The additional latency shifts the throughput chart to the right, as more threads are needed to hide the latency from the network. However, the maximum throughput decreases significantly as latencies increase beyond 10ms.

Regarding latency, at 2ms to 10ms we see around 1-2 RTT, while higher latencies can reach up to 4 RTT. This is an interesting problem space where some possible GR optimizations have been identified.

3.3. Single vs Multi-master Group Replication throughput

One of Group Replication main features is the ability to have MySQL servers collaborate and build a true multi-master architecture, where clients can write to the database through any of the members of the server group. Although not recommended due to some limitations, advanced users may acknowledge that and deploy GR in multi-master mode.

A frequent doubt is if using multi-master is good for performance, considering the total write capacity of the system does not increase by using multi-master on GR, as all members need to apply the same inserts/deletes/updates in the workload.

The performance benefit from using multi-master for performance depends a lot on the characteristics of the user workload, and usually it becomes larger when transactions have a large read portion and write to independent areas of the database.

The following charts show the throughput when Sysbench clients are directed to a single member, or two members, in a three member group. Given the differences, the charts include both durable and non-durable configurations.

As it can be seen, there is some benefit from using multi-master on Sysbench RW (up to 40%), mainly when the server is reaching its capacity and it becomes beneficial to send transactions to multiple servers to split the read load between them.

On Sysbench Update Index and Write-only, workloads that have only write operations, there is still some benefit of using multi-master at high concurrency but there is also a penalty when the number of client threads is small.

4. Good news for MySQL 5.7 users also

WRITESET dependency tracking affects the slave throughput, but to use this feature only the master has to have this feature enabled, independently of the slaves are running MySQL 8.0 or MySQL 5.7.

Until 5.7.22, that meant using a MySQL 8.0 master, but given the benefits on workloads with low-concurrency on the master and on Group Replication recovery, it was decided to back-port this feature to MySQL 5.7.

So, if you have a MySQL 5.7 deployment and can’t migrate to MySQL 8.0, particularly when Group Replication is being used, be sure to check this new feature to improve recovery time and eliminate slave lag due to applier performance (you may want to revisit the charts in 2.1 and 2.2 now…).

5. Conclusion

MySQL 8.0 has replication performance improvements that translate to: i) higher throughput on the slave applier, ii) improved scalability and recovery time for group replication, iii) less space/bandwidth when replicating large JSON documents and iv) better monitoring of the replication topology.

A Complete solution for Database administration

Tuesday 17 September 2019

Performance Improvements in MySQL 8.0 Replication