Sunday 1 September 2019

MySQL online schema change

MySQL has been one of the technologies I’ve been working the most during the past years. It has been painful and powerful at the same time. Only recently I’ve started to clearly understand its potential and I can say with confidence, I can never recommend it enough as general purpose RDBMS
As you may know the main frustration with MySQL — but also other relational ACID databases — is manipulating existing schemas in production. You might want to change your existing table’s structure without causing database downtime. Technically speaking such operation is done through Data Definition Language statements (CREATE, ALTER, DROP..)
Operating a schema change in production, can be an issue mostly when the affected tables to be changed have a huge number of rows (order of Millions). The schema change could require minutes if not hours for the database to complete, due the full table restructuring

MySQL concurrency model at a glance

Modern RDBMS employ strong high-concurrency and data-consistency mechanisms. A simplistic-enough example to understand on how that happens is the readers-writers locking strategy (Two Phase Locking) and works as it follows
InnoDB is one of the default MySQL internal storage engines and supports a combination of 2PL (Two Phase Locking) and MVCC (Multi-Version Concurrency Control). Thanks to MVCC, in case of two concurrent write operations, contention can be removed — as opposed to what happens in the Write Lock scenario — thus allowing multiple writes to happen concurrently without locking. This will cause the transactions to operate on two different versions of the row, causing a conflict that will be resolved later on
In MySQL jargon, read operations are often defined as DQL (Data Query Language) and write operations are DML (Data Manipulation Language)Remember these acronyms, ðŸ™‡‍ you will see them again in the rest of the article

Why are DDL statements so expensive?

DDL Statements like an ALTER TABLE will cause contention and block writers (DML operations) to complete. Take a look at the Write Lock we have seen before. You can also have a read to the official MySQL docs for better understanding 📚
An ALTER TABLE operation run with the ALGORITHM=COPY clause prevents concurrent DML operations. Concurrent queries are still allowed. That is, a table-copying operation always includes at least the concurrency restrictions of LOCK=SHARED (allow queries but not DML). You can further restrict concurrency for such operations by specifying LOCK=EXCLUSIVE, which prevents DML and queries
As the official docs tells us, we can either allow Reads (DQL) and block Writes (DML) (Shared Lock) or block both (Exclusive Lock)

A game changer — MySQL Online DDL

Starting from MySQL 5.6, InnoDB Online DDL was introduced, specifically to solve the problem above and
  • Allows a more granular control over the locking strategy
  • Ability to adjust the balance between performance and concurrency during a DDL operation (ALTER, CREATE, DROP…)
  • In-place operations allowing less disk I/O due to the table-copying operations caused by DDL statements
And most importantly all of this translate into the ability of running MySQL online schema statements in production with major improvement on database load and efficiency

ALTER-ing a MySQL Table without downtime

Now it should be clear what happens when you try to alter a huge MySQL table in a high-concurrent environment. It will block all DML statementsby default (INSERT, UPDATE, DELETE..)
ALTER TABLE `user` ADD COLUMN `phone` INT(15) NOT NULL;
Thanks to the new InnoDB Online DDL feature, we can now properly alter the table without causing a major downtime, by specifying the right — favourable — locking strategy
ALTER TABLE `user` ADD COLUMN `phone` INT(15) NOT NULL, ALGORITHM=INPLACE, LOCK=NONE;
  • Using LOCK=NONE permit reads and writes or LOCK=SHARED permit reads
  • You can also compare performance using ALGORITHM=INPLACE or ALGORITHM=COPY
With this setting, concurrent DML operations (INSERT, UPDATE..) are not blocked. Does this mean we can do any kind of MySQL Online Schema Migration without any downtime? Not Really, limits exist

ALTER-ing tables Foreign Key Constraints

In the example above we’ve seen a simple ADD COLUMN operation. What if we wanted to add a new foreign key constraint to a table, using the InnoDB Online DDL?
ALTER TABLE `subscription` ADD CONSTRAINT `fk_user_id` FOREIGN KEY (`user_id`) REFERENCES `user`(id), ALGORITHM=INPLACE, LOCK=NONE;
That wouldn’t work 😞
Indeed InnoDB checks foreign key constraints immediately; the check is not deferred to transaction commitThe only way to run the alter table without locking is to try again the statement, just after having set
SET foreign_key_checks=0;
ALTER TABLE `subscription` ADD CONSTRAINT `fk_user_id` FOREIGN KEY (`user_id`) REFERENCES `user`(id), ALGORITHM=INPLACE, LOCK=NONE;
SET foreign_key_checks=1;
Only if you feel comfortable trusting your application logic to handle foreign key validation for the duration of your alter table. Quite riskyotherwise! ⚠️ Also this is a session variable so remember to set it back onafter you’ve done

Understanding InnoDB Online DDL limitations

As specified in the official docs, a DDL statement using the LOCK=NONE to enable concurrent DML statements can be halted by the database in case the level of concurrency is not available, safely rolling back the entire operation. While a DDL statement is altering the table structure, concurrent DML statements that write to it, are not being blocked but rather recorded into a log file, which is used after to replay those statements. The size of this file is limited, therefore limiting the number of concurrent DML operations
The log file max size is specified by the InnoDB system variable innodb_online_alter_log_max_sizeYou can increase it, but bigger log file means, more DML statements to replay onto the table. It’ll result in more time for the table being locked after such operation. Just find the balance

1 comment:

  1. I found devart unidac, this tool helps to work with MySQL db.

    ReplyDelete

The complete list of new features in MySQL 8.0

There are over 250 new features in MySQL 8.0. The MySQL Manual is very good, but verbose. This is a list of new features in short bullet f...