MySQL has been one of the technologies I’ve been working the most during the past years. It has been painful and powerful at the same time. Only recently I’ve started to clearly understand its potential and I can say with confidence, I can never recommend it enough as general purpose RDBMS
As you may know the main frustration with MySQL — but also other relational ACID databases — is manipulating existing schemas in production. You might want to change your existing table’s structure without causing database downtime. Technically speaking such operation is done through Data Definition Language statements (CREATE, ALTER, DROP..)
Operating a schema change in production, can be an issue mostly when the affected tables to be changed have a huge number of rows (order of Millions). The schema change could require minutes if not hours for the database to complete, due the full table restructuring
MySQL concurrency model at a glance
Modern RDBMS employ strong high-concurrency and data-consistency mechanisms. A simplistic-enough example to understand on how that happens is the readers-writers locking strategy (Two Phase Locking) and works as it follows
- A Read Lock blocks writers, allowing readers to read (MySQL Shared Lock)
- A Write Lock blocks writers and readers, causing the writers to proceed after the lock is released (MySQL Exclusive Lock)
InnoDB is one of the default MySQL internal storage engines and supports a combination of 2PL (Two Phase Locking) and MVCC (Multi-Version Concurrency Control). Thanks to MVCC, in case of two concurrent write operations, contention can be removed — as opposed to what happens in the Write Lock scenario — thus allowing multiple writes to happen concurrently without locking. This will cause the transactions to operate on two different versions of the row, causing a conflict that will be resolved later on
In MySQL jargon, read operations are often defined as DQL (Data Query Language) and write operations are DML (Data Manipulation Language)Remember these acronyms, 🙇 you will see them again in the rest of the article
Why are DDL statements so expensive?
DDL Statements like an ALTER TABLE will cause contention and block writers (DML operations) to complete. Take a look at the Write Lock we have seen before. You can also have a read to the official MySQL docs for better understanding 📚
AnALTER TABLE
operation run with theALGORITHM=COPY
clause prevents concurrent DML operations. Concurrent queries are still allowed. That is, a table-copying operation always includes at least the concurrency restrictions ofLOCK=SHARED
(allow queries but not DML). You can further restrict concurrency for such operations by specifyingLOCK=EXCLUSIVE
, which prevents DML and queries
As the official docs tells us, we can either allow Reads (DQL) and block Writes (DML) (Shared Lock) or block both (Exclusive Lock)
A game changer — MySQL Online DDL
Starting from MySQL 5.6, InnoDB Online DDL was introduced, specifically to solve the problem above and
- Allows a more granular control over the locking strategy
- Ability to adjust the balance between performance and concurrency during a DDL operation (ALTER, CREATE, DROP…)
- In-place operations allowing less disk I/O due to the table-copying operations caused by DDL statements
And most importantly all of this translate into the ability of running MySQL online schema statements in production with major improvement on database load and efficiency
ALTER-ing a MySQL Table without downtime
Now it should be clear what happens when you try to alter a huge MySQL table in a high-concurrent environment. It will block all DML statementsby default (INSERT, UPDATE, DELETE..)
ALTER TABLE `user` ADD COLUMN `phone` INT(15) NOT NULL;
Thanks to the new InnoDB Online DDL feature, we can now properly alter the table without causing a major downtime, by specifying the right — favourable — locking strategy
ALTER TABLE `user` ADD COLUMN `phone` INT(15) NOT NULL, ALGORITHM=INPLACE, LOCK=NONE;
- Using
LOCK=NONE
permit reads and writes orLOCK=SHARED
permit reads - You can also compare performance using
ALGORITHM=INPLACE
orALGORITHM=COPY
With this setting, concurrent DML operations (INSERT, UPDATE..) are not blocked. Does this mean we can do any kind of MySQL Online Schema Migration without any downtime? Not Really, limits exist
ALTER-ing tables Foreign Key Constraints
In the example above we’ve seen a simple
ADD COLUMN
operation. What if we wanted to add a new foreign key constraint to a table, using the InnoDB Online DDL?ALTER TABLE `subscription` ADD CONSTRAINT `fk_user_id` FOREIGN KEY (`user_id`) REFERENCES `user`(id), ALGORITHM=INPLACE, LOCK=NONE;
That wouldn’t work 😞
Indeed InnoDB checks foreign key constraints immediately; the check is not deferred to transaction commit. The only way to run the alter table without locking is to try again the statement, just after having set
SET foreign_key_checks=0;
ALTER TABLE `subscription` ADD CONSTRAINT `fk_user_id` FOREIGN KEY (`user_id`) REFERENCES `user`(id), ALGORITHM=INPLACE, LOCK=NONE;
SET foreign_key_checks=1;
Only if you feel comfortable trusting your application logic to handle foreign key validation for the duration of your alter table. Quite riskyotherwise! ⚠️ Also this is a session variable so remember to set it back onafter you’ve done
Understanding InnoDB Online DDL limitations
As specified in the official docs, a DDL statement using the
LOCK=NONE
to enable concurrent DML statements can be halted by the database in case the level of concurrency is not available, safely rolling back the entire operation. While a DDL statement is altering the table structure, concurrent DML statements that write to it, are not being blocked but rather recorded into a log file, which is used after to replay those statements. The size of this file is limited, therefore limiting the number of concurrent DML operations
The log file max size is specified by the InnoDB system variable innodb_online_alter_log_max_size. You can increase it, but bigger log file means, more DML statements to replay onto the table. It’ll result in more time for the table being locked after such operation. Just find the balance
I found devart unidac, this tool helps to work with MySQL db.
ReplyDelete