Planet MySQL

Planet MySQL -
  1. In this blog post, we’ll look at how ZFS affects MySQL performance when used in conjunction. ZFS and MySQL have a lot in common since they are both transactional software. Both have properties that, by default, favors consistency over performance. By doubling the complexity layers for getting committed data from the application to a persistent disk, we are logically doubling the amount of work within the whole system and reducing the output. From the ZFS layer, where is really the bulk of the work coming from? Consider a comparative test below from a bare metal server. It has a reasonably tuned config (discussed in separate post, results and scripts here). These numbers are from sysbench tests on hardware with six SAS drives behind a RAID controller with a write-backed cache. Ext4 was configured with RAID10 softraid, while ZFS is the same (striped three pairs of mirrored VDEvs). There are a few obvious observations here, one being ZFS results have a high variance between median and the 95th percentile. This indicates a regular sharp drop in performance. However, the most glaring thing is that with write-only only workloads of update-index, overall performance could drop to 50%: Looking further into the IO metrics for the update-index tests (95th percentile from /proc/diskstats), ZFS’s behavior tells us a few more things.   ZFS batches writes better, with minimal increases in latency with larger IO size per operation. ZFS reads are heavily scattered and random – the high response times and low read IOPs and throughput means significantly higher disk seeks. If we focus on observation #2, there are a number of possible sources of random reads: InnoDB pages that are not in the buffer pool When ZFS records are updated, metadata also has to be read and updated This means that for updates on cold InnoDB records, multiple random reads are involved that are not present with filesystems like ext4. While ZFS has some tunables for improving synchronous reads, tuning them can be touch and go when trying to fit specific workloads. For this reason, ZFS introduced the use of L2ARC, where faster drives are used to cache frequently accessed data and read them in low latency. We’ll look more into the details how ZFS affects MySQL, the tests above and the configuration behind them, and how we can further improve performance from here in upcoming posts.
  2. MongoDB used to have a great story for sharded replica sets. But the storage engine, sharding and replica management code had significant room for improvement. Over the last few releases they made remarkable progress on that and the code is starting to match the story. I continue to be impressed by the rate at which they paid off their tech debt and transactions coming to MongoDB 4.0 is one more example.It is time for us to do the same in the MySQL community.I used to be skeptical about the market for sharded replica sets with MySQL. This is popular with the web-scale crowd but that is a small market. Today I am less skeptical and assume the market extends far beyond web-scale. This can be true even if the market for replicasets, without sharding, is so much larger.The market for replica sets is huge. For most users, if you need one instance of MySQL then you also need HA and disaster recovery. So you must manage failover and for a long time (before crash-proof slaves and GTID) that was a lousy experience. It is better today thanks to cloud providers and DIY solutions even if some assembly is required. Upstream is finally putting a solution together with MySQL Group Replication and other pieces. But sharded replica sets are much harder, and even more so if you want to do cross-shard queries and transactions. While there have been many attempts at sharding solutions for the MySQL community, it is difficult to provide something that works across customers. Fortunately Vitess has shown this can be done and already has many customers in production. ProxySQL and Orchestrator might also be vital pieces of this stack. I am curious to see how the traditional vendors (MySQL, MariaDB, Percona) respond to this progress.Updates:I think binlog server should be part of the solution. But for that to happen we need a GPLv2 binlog server and that has yet to be published.
  3. Today, let’s have a look at the TOP 10 new features in MySQL 8.0 that will improve DBA’s life. To shrink the list to 10 only items wasn’t an easy task, but here is the top 10: Temporary Tables Improvements Persistent global variables No more MyISAM System Tables Reclaim UNDO space from large transactions UTF8 performance Removing Query Cache Atomic DDLs Faster & More Complete Performance Schema (Histograms, Indexes, …) and Information Schema ROLES REDO & UNDO logs encrypted if tablespace is encrypted Temporary Tables Improvements Since 5.7, all internal temporary tables are created in a unique shared tablespace called “ibtmp1“. Additionally, the metadata for temp tables will also be stored in memory (not anymore in .frm files). In MySQL 8.0, the MEMORY storage engine will also be replaced as default engine for internal temporary tables (those created by the Optimizer during JOIN, UNION, …) by the TempTable storage engine. This new engine provides more efficient storage for VARCHAR and VARBINARY columns (with Memory the full maximum size is allocated). Persistent Global Variables With MySQL 8.0 it is now also possible to set variables and make the change persistent to server’s reboot. I’ve written a dedicated blog post that you can check for more information. Combined this syntax and the new RESTART command, makes very easy to configure MySQL from its shell. This is a cloud friendly feature! No more MyISAM System Tables With the new native Data Dictionary, we won’t need MyISAM system tables anymore ! Those tables and the data dictionary tables are now created in a single InnoDB tablespace file named mysql.idb in the data directory. This means that if you don’t explicitly use MyISAM tables (which is totally inadvisable if you care about your data) you can have a MySQL instance without any MyISAM table. Reclaim UNDO space from large transactions In MySQL 5.7, we already added the possibility to truncate undo spaces (innodb_undo_log_truncate, disabled by default). In MySQL8, we changed the undo disk format to support a huge number of rollback segments per undo tablespaces. Also, by default, the rollback segments are now created in two separate undo tablespaces instead of the InnoDB system tablespace (2 is now the minimum and this setting is now dynamic). We also deprecated the variable to set that value (innodb_undo_tablespaces) as we will provide SQL commands giving DBAs a real interface to interact with UNDO Tablespaces too. Automatic truncation of undo tablespaces is also now enabled by default. UTF8 Performance The default character set has changed from latin1 to utf8mb4 as UTF8 is now much faster up to 1800% faster on specific queries ! Emojis are everywhere now and MySQL supports them without problem ! Removing Query Cache The first thing I was always advising during a performance audit was to disable the Query Cache as it didn’t scale by design. The MySQL QC was creating more issues than it solved. We decided to simply remove it in MySQL 8.0 as nobody should use it. If your workload requires a Query Cache, then you should have a look to ProxySQL as Query Cache. Atomic DDLs With the new Data Dictionary, MySQL 8.0 now supports Atomic Data Definition Statements (Atomic DDLs). This means that when a DDL is performed, the data dictionary updates, the storage engine operation and the writes in the binary log are combined into a single atomic transaction that is either fully executed or not at all. This provides a better reliability where unfinished DDLs don’t leave any incomplete data. Faster & More Complete Performance Schema (Histograms, Indexes, …) and Information Schema Many improvements were made to Performance Schema like fake indexes or histograms. With the contribution of fake indexes, queries like SELECT * FROM sys.session became 30x faster. Tables scans are now avoided as much as possible and the use of indexes improves a lot the execution time. Additionally to that, Performance Schema also provides histograms of statements latency. The Optimizer can also benefit form these new histograms. Information Schema has also been improved by the use of the Data Dictionary. No more .frm files are needed to know the table’s definition. Also this allow to scale to more than 1.000.000 tables ! ROLES SQL Roles have been added to MySQL 8.0. A role is a named collection of privileges. Like user accounts, roles can have privileges granted to and revoked from them. Roles can be applicable by default or by session. There is also the possibility to set roles to be mandatory. REDO & UNDO logs encrypted if tablespace is encrypted In MySQL 5.7, it was possible to encrypt an InnoDB tablespace for tables stored in file-per-table. In MySQL 8.0 we completed this feature by adding encryption for UNDO and REDO logs too. And once again, the list of improvements doesn’t finish here. There are many other nice features. I would like to list below some other important ones (even if they are all important of course ) persistent auto increment InnoDB self tuning JSON performance Invisible Indexes new lock for backup Resource Groups additional metadata into binary logs OpenSSL for Community Edition too Please check the online manual to have more information about all these new features.
  4. Since January 2017, the MySQL Replication Team has been involved in processing many Community Contributions ! We are really happy to receive contributions (and not only in the replication team), but this also implies a lot of work from our engineers, as more than resolving a bug or developing a new feature, code contributions need to be analyzed, the code needs to be understood and validated.…
  5. With support of multi-threads replication starting from MySQL 5.7, the operations on slave are slightly different from single-thread replication. Here is a list of some operation tips for the convenience of use as below: 1. Skip a statement for a specific channel. Sometimes, we might find out that one of the channels stop replication due to some error, and we may want to skip the statement for that channel so that we can restart a slave for it. We need to be very careful not to skip the statement from the other channel, since the command SET GLOBAL sql_slave_skip_counter = N is for global. How can we make sure the global sql_slave_skip_counter is applied to a specific channel and not to the other channel? Here are the steps: 1.1: Stop all slaves by: stop slave; stop slave; 1.2: Set up the count of statement to skip by: SET GLOBAL sql_slave_skip_counter = N; SET GLOBAL sql_slave_skip_counter = 1; 1.3: Start slave on the channel we want to skip the statement on. The command will use the setting for global sql_slave_skip_counter = 1 to skip one statement and start slave on that channel (for example ‘main’) by: starting slave for channel ‘channel-name’; start slave for channel 'main'; 1.4: Start slave on all the other channels by: start slave; start slave; 2. Check the status of replication with detailed messages in the table performance_schema.replication_applier_status_by_worker through select * from the table: mysql> select * from performance_schema.replication_applier_status_by_worker; | CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP | | metrics | 1 | 1784802 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | | accounting | 1 | 1851760 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | | main | 1 | NULL | OFF | ANONYMOUS | 1051 | Worker 0 failed executing transaction 'ANONYMOUS' at master log mysql-bin.019567, end_log_pos 163723076; Error 'Unknown table 'example.accounts'' on query. Default database: 'pythian'. Query: 'DROP TABLE `example`.`accounts` /* generated by server */' | 2018-02-14 23:57:52 | | log | 1 | 1784811 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | mysql> select * from performance_schema.replication_applier_status_by_worker; | CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP | | metrics | 1 | 1965646 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | | accounting | 1 | 1965649 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | | main | 1 | 1965633 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | | log | 1 | 1965652 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 | 3. Check the status for a specific channel by: show slave status for channel ‘channel-name’\G : mysql> show slave status for channel 'main'\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: Master_User: replicator Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.019567 Read_Master_Log_Pos: 869255591 Relay_Log_File: db-test-02-relay-bin-example.000572 Relay_Log_Pos: 45525401 Relay_Master_Log_File: mysql-bin.019567 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: test.sessions,test.metrics Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 869255591 Relay_Log_Space: 869256195 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 4118338212 Master_UUID: b8cee5b1-3161-11e7-8109-3ca82a217b08 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: insight Master_TLS_Version: I hope this short list of tips helps you enjoy multi-threads replication.