Planet MySQL - https://planet.mysql.com
  1. Database migrations don’t scale well. Typically you need to perform a great deal of testing before you can pull the trigger and switch from old to new. Migrations are usually done manually, as most of the process does not lend itself to automation. But that doesn’t mean there is no room for automation in the migration process. Imagine setting up a number of nodes with new software, provisioning them with data and configuring replication between old and new environments by hand. This takes days. Automation can be very useful when setting up a new environment and provisioning it with data. In this blog post, we will take a look at a very simple migration - from a standalone Percona Server 5.7 to a 3-node Percona XtraDB Cluster 5.7. We will use Ansible to accomplish that.

Environment Description

First of all, one important disclaimer - what we are going to show here is only a draft of what you might like to run in production. It does work in our test environment, but it may require modifications to make it suitable for your environment. In our tests we used four Ubuntu 16.04 VMs deployed using Vagrant. One contains the standalone Percona Server 5.7; the remaining three will be used for Percona XtraDB Cluster nodes. We also use a separate node for running the Ansible playbooks, although this is not a requirement and the playbook can also be executed from one of the nodes. In addition, SSH connectivity is available between all of the nodes. You have to have connectivity from the host where you run Ansible, but having the ability to SSH between nodes is useful (especially between master and new slave - we rely on this in the playbook).

Playbook Structure

Ansible playbooks typically share a common structure - you create roles, which can be assigned to different hosts. Each role will contain tasks to be executed on it, templates that will be used, files that will be uploaded, and variables which are defined for this particular playbook. In our case, the playbook is very simple.

.
├── inventory
├── playbook.yml
├── roles
│   ├── first_node
│   │   ├── my.cnf.j2
│   │   ├── tasks
│   │   │   └── main.yml
│   │   └── templates
│   │       └── my.cnf.j2
│   ├── galera
│   │   ├── tasks
│   │   │   └── main.yml
│   │   └── templates
│   │       └── my.cnf.j2
│   ├── master
│   │   └── tasks
│   │       └── main.yml
│   └── slave
│       └── tasks
│           └── main.yml
└── vars
    └── default.yml

We defined a couple of roles - we have a ‘master’ role, which is intended to do some sanity checks on the standalone node. There is a ‘slave’ role, which will be executed on one of the Galera nodes to configure it for replication and set up the asynchronous replication. Then we have a role for all Galera nodes and a role for the first Galera node, used to bootstrap the cluster from it. For the Galera roles, we have a couple of templates that we will use to create my.cnf files. We will also use a local .my.cnf to define a username and password. We have a file containing a couple of variables which we may want to customize, such as passwords. Finally, we have an inventory file, which defines the hosts on which we will run the playbook, and the playbook file with information on how exactly things should be executed. Let’s take a look at the individual bits.

Inventory File

This is a very simple file.

[galera]
10.0.0.142
10.0.0.143
10.0.0.144

[first_node]
10.0.0.142

[master]
10.0.0.141

We have three groups: ‘galera’, which contains all Galera nodes; ‘first_node’, which we will use for the bootstrap; and finally ‘master’, which contains our standalone Percona Server node.
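For reference, this is how a playbook like this is typically invoked against the inventory above (the invocation itself is not shown in the post, so treat it as an assumption; extra flags such as the remote SSH user depend on your environment):

# Run from the host that has SSH access to every node listed in the inventory.
# -i points Ansible at the inventory file shown above.
ansible-playbook -i inventory playbook.yml

Ansible then works through the plays in playbook.yml, host group by host group, in the order described in the next section.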
Playbook.yml

The file playbook.yml contains the general guidelines on how the playbook should be executed.

- hosts: master
  gather_facts: yes
  become: true
  pre_tasks:
    - name: Install Python2
      raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
  vars_files:
    - vars/default.yml
  roles:
    - { role: master }

As you can see, we start with the standalone node and apply the tasks related to the role ‘master’ (we will discuss this in detail further down in this post).

- hosts: first_node
  gather_facts: yes
  become: true
  pre_tasks:
    - name: Install Python2
      raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
  vars_files:
    - vars/default.yml
  roles:
    - { role: first_node }
    - { role: slave }

Second, we go to the node defined in the ‘first_node’ group and apply two roles: ‘first_node’ and ‘slave’. The former is intended to deploy a single-node PXC cluster, the latter will configure it to work as a slave and set up the replication.

- hosts: galera
  gather_facts: yes
  become: true
  pre_tasks:
    - name: Install Python2
      raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
  vars_files:
    - vars/default.yml
  roles:
    - { role: galera }

Finally, we go through all Galera nodes and apply the ‘galera’ role on all of them.

Variables

Before we begin to look into the roles, we want to mention the default variables that we defined for this playbook.

sst_user: "sstuser"
sst_password: "pa55w0rd"
root_password: "pass"
repl_user: "repl_user"
repl_password: "repl1cati0n"

As we stated, this is a very simple playbook without many options for customization. You can configure users and passwords and that is basically it. One gotcha - please make sure that the standalone node’s root password matches ‘root_password’ here, as otherwise the playbook won’t be able to connect to that node (this could be extended to handle such a case, but we did not cover that). This file is not worth much on its own but, as a rule of thumb, you want to encrypt any file which contains credentials. Obviously, this is for security reasons. Ansible comes with ansible-vault, which can be used to encrypt and decrypt files. We will not cover the details here; everything you need to know is available in the documentation. In short, you can easily encrypt files using passwords and configure your environment so that the playbooks can decrypt them automatically, using a password read from a file or passed by hand.

Roles

In this section we will go over the roles that are defined in the playbook, summarizing what they are intended to perform.

Master role

As we stated, this role is intended to run a sanity check on the configuration of the standalone MySQL. It installs required packages like percona-xtrabackup-24. It also creates the replication user on the master node. The configuration is reviewed to ensure that server_id and other replication- and binary-log-related settings are set. GTID is also enabled, as we will rely on it for replication.

First_node role

Here, the first Galera node is installed. The Percona repository is configured, my.cnf is created from the template and PXC is installed. We also run some cleanup to remove unneeded users and to create those which will be required (a root user with the password of our choosing, and the user required for SST). Finally, the cluster is bootstrapped using this node. We rely on the empty ‘wsrep_cluster_address’ as a way to initialize the cluster.
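To illustrate the idea (this is a sketch, not the literal template shipped with the playbook), the bootstrap my.cnf and the final my.cnf differ mainly in this one line:

# bootstrap version - an empty cluster address tells the node to form a new cluster
wsrep_cluster_address=gcomm://

# final version - lists all members of the cluster (addresses from the inventory above)
wsrep_cluster_address=gcomm://10.0.0.142,10.0.0.143,10.0.0.144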
This is why later we still execute the ‘galera’ role on the first node - to swap the initial my.cnf with the final one, which contains ‘wsrep_cluster_address’ listing all the members of the cluster. One thing worth remembering - when you create a root user with a password, you have to be careful not to get locked out of MySQL, so that Ansible can execute the remaining steps of the playbook. One way to do that is to provide a .my.cnf with the correct user and password. Another would be to remember to always set the correct login_user and login_password in the ‘mysql_user’ module.

Slave role

This role is all about configuring replication between the standalone node and the single-node PXC cluster. We use xtrabackup to get the data, and we also check for the executed GTID set in xtrabackup_binlog_info to ensure the backup will be restored properly and that replication can be configured. We also perform a bit of configuration, making sure that the slave node can use GTID replication. There are a couple of gotchas here. It is not possible to run ‘RESET MASTER’ using the ‘mysql_replication’ module as of Ansible 2.7.10 (it should be possible in 2.8, whenever it comes out), so we had to use the ‘shell’ module to run MySQL CLI commands - a sketch of those commands follows at the end of this post. When rebuilding a Galera node from an external source, you have to remember to re-create any required users (at least the user used for SST). Otherwise the remaining nodes will not be able to join the cluster.

Galera role

Finally, this is the role in which we install PXC on the remaining two nodes. We run it on all nodes; the initial one will get the “production” my.cnf instead of its “bootstrap” version. The remaining two nodes will have PXC installed, and they will get an SST from the first node in the cluster.

Summary

As you can see, you can easily create a simple, reusable Ansible playbook which can be used for deploying Percona XtraDB Cluster and configuring it to be a slave of a standalone MySQL node. To be honest, for migrating a single server this probably makes little sense, as doing the same manually will be faster. Still, if you expect to have to re-execute this process a couple of times, it will definitely make sense to automate it and make it more time efficient. As we stated at the beginning, this is by no means a production-ready playbook. It is more of a proof of concept, something you may extend to make it suitable for your environment. You can find an archive with the playbook here: http://severalnines.com/sites/default/files/ansible.tar.gz We hope you found this blog post interesting and valuable; do not hesitate to share your thoughts.
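As promised above, here is a rough sketch of the kind of statements the ‘shell’ module ends up running on the new slave to work around the ‘RESET MASTER’ limitation and point replication at the standalone node. The host and credentials come from the inventory and variables above; the GTID set is the one read from xtrabackup_binlog_info and is left as a placeholder here. The exact tasks in the playbook may differ.

-- Clear any local binary log/GTID state left over from the restore.
RESET MASTER;
-- Tell the slave which transactions are already contained in the restored backup.
SET GLOBAL gtid_purged = '<gtid set from xtrabackup_binlog_info>';
-- Point replication at the standalone master, relying on GTID auto-positioning.
CHANGE MASTER TO
  MASTER_HOST = '10.0.0.141',
  MASTER_USER = 'repl_user',
  MASTER_PASSWORD = 'repl1cati0n',
  MASTER_AUTO_POSITION = 1;
START SLAVE;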
  2. With the release of version 8.0.16 of MySQL, we’ve added some features to the Group Replication (GR) plugin, in order to enhance its high-availability capabilities. One of these features was the ability to enable a member that has left the group, under certain circumstances, to automatically rejoin it without the need for intervention by the user. …
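As a quick illustration (not part of the quoted post), the automatic rejoin behaviour introduced in 8.0.16 is controlled by a new system variable; a minimal example of enabling it:

-- Let a member that was expelled from the group try to rejoin automatically
-- up to 3 times before following the configured exit action.
SET GLOBAL group_replication_autorejoin_tries = 3;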
  3. I recently stumbled upon a very interesting post by Lukas Eder, where he describes 10 query transformations which do not depend on the database’s cost model. He posted it a couple of years ago, but when I read it, I assumed some portions of it may still be relevant today. In the original post, several databases were tested to see if their internal optimizer would be able to automatically re-write the SQL queries and optimize them. In those tests, MySQL under-performed in several of the use cases (the tested version was MySQL 8.0.2, which was released on 2017-07-17). Seeing those results, and given that the previous evaluation was done almost two years ago, I thought that now would be a good time to re-evaluate a few of those tests with the latest MySQL 8.0.16 (released on 2019-04-25), and to demonstrate EverSQL Query Optimizer‘s capabilities while we are at it. We’ll use the Sakila sample database for the demonstrations, and some of the original queries from Lukas’s post. The following tables will be used for these demonstrations:

CREATE TABLE address (
  address_id INT NOT NULL,
  address VARCHAR(50) NOT NULL,
  CONSTRAINT pk_address PRIMARY KEY (address_id)
);

CREATE TABLE customer (
  customer_id INT NOT NULL,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,
  address_id INT NOT NULL,
  CONSTRAINT pk_customer PRIMARY KEY (customer_id),
  CONSTRAINT fk_customer_address FOREIGN KEY (address_id) REFERENCES address(address_id)
);

CREATE TABLE actor (
  actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,
  last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (actor_id),
  KEY idx_actor_last_name (last_name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

JOIN Elimination

Looking at the query below, you’ll probably notice that the table address isn’t really used anywhere in the query (other than in the join’s ON clause), and has no actual contribution to the query. The columns are selected from the customer table, and there are no predicates for the address table (or any predicates for that matter). The existence of a PRIMARY KEY on address_id in the table address and the FOREIGN KEY on the same column in the table customer should give the database’s optimizer the confidence to conclude that the join to the table address can be eliminated.

SELECT c.first_name, c.last_name
FROM customer c
JOIN address a ON c.address_id = a.address_id;

Yet, as you can see from the EXPLAIN, MySQL executes the join, without applying the expected query transformation. When submitting the same query and schema structure to EverSQL, it will output the corresponding rewritten query and recommendation. In a similar manner, MySQL 8.0.16 will execute the join for all other examples in the Join Elimination section of the original post, so I saw no reason to repeat them here.

Unneeded Self JOIN

In the following query, the table actor is joined to itself. It can be proven that a1 = a2, because the tables are joined using the primary key actor_id. Therefore, anything we can do with a2 can actually be done with a1 as well, so we can modify the references to a2 in the SELECT clause to point to the same columns in a1 and remove the redundant join to a2.

SELECT a1.first_name, a2.last_name
FROM actor a1
JOIN actor a2 ON a1.actor_id = a2.actor_id;

MySQL 8.0.16 will execute the join in this case as well, without applying the expected query transformation.
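Since the tool’s output screenshots are not reproduced in this feed, here is a hand-written sketch of the rewrites the two sections above expect; the exact wording of EverSQL’s recommendations may differ.

-- JOIN elimination: the NOT NULL foreign key guarantees a matching address row
-- for every customer, so the inner join adds nothing and can be dropped.
SELECT c.first_name, c.last_name
FROM customer c;

-- Unneeded self join: a1 and a2 are always the same row, so a2 can be replaced
-- by a1 and the second join removed.
SELECT a1.first_name, a1.last_name
FROM actor a1;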
When submitting the same query and schema structure to EverSQL, it will output the corresponding rewritten query and recommendation. The original queries and information for this section can be found here.

Predicate Pushdown

We should always strive to have our SQL queries process as little data as possible, especially if the filtering is done using indexes. In the following query, we expect MySQL to push the condition from the outer query into both parts of the UNION ALL, to make sure we filter out as much data as we can, as early as possible.

SELECT *
FROM (SELECT first_name, last_name, 'actor' type
      FROM actor
      UNION ALL
      SELECT first_name, last_name, 'customer' type
      FROM customer) people
WHERE people.last_name = 'DAVIS';

As the EXPLAIN shows, the transformation isn’t applied by MySQL, and both tables, actor and customer, are scanned in full. Submitting the query and schema structure to EverSQL results in a rewritten query and recommendation, and when looking at the execution plan of the optimized query, you can see that the indexes are used and significantly less data is scanned.

Wrapping up

Query transformations can be very powerful, and it’s important to understand which of them will be applied automatically by the database and which won’t. In this post, we listed three examples (originally posted by Lukas Eder) in which MySQL 8.0.16 didn’t apply the expected transformations, which eventually resulted in non-optimal execution plans.
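For completeness, a hand-written version of the predicate pushdown discussed above might look as follows (a sketch of the transformation the post expects, not EverSQL’s literal output); with the condition pushed into each branch, the actor branch can use idx_actor_last_name.

-- Condition from the outer query pushed into both branches of the UNION ALL.
SELECT *
FROM (SELECT first_name, last_name, 'actor' type
      FROM actor
      WHERE last_name = 'DAVIS'
      UNION ALL
      SELECT first_name, last_name, 'customer' type
      FROM customer
      WHERE last_name = 'DAVIS') people;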
  4. Overview - The Skinny

In this blog post we will discuss how the managed cross-site replication streams work in a Composite Multi-Master Tungsten Cluster for MySQL, MariaDB and Percona Server.

Agenda - What’s Here?

- Briefly explore how managed cross-site replication works in a Tungsten Composite Multi-Master Cluster
- Describe the reasons why the default design was chosen
- Explain the pros and cons of changing the configuration
- Examine how to change the configuration of the managed cross-site replicators

Cross-Site Replication - A Very Brief Summary

In a standard Composite Multi-Master (CMM) deployment, the managed cross-site replicators pull Transaction History Logs (THL) from every remote cluster’s current master node. The CMM functionality was introduced in Tungsten Clustering software version 6.0.0.

Cross-Site Replication: In-Depth - How Does It All Work?

Managed cross-site replicators run in addition to the base replication service for the local cluster. The additional replication services (one per remote site) run on all nodes, and have a relay and slaves. The relay runs on the current master node so as not to make things confusing. In a Composite Multi-Master cluster, each local cluster must pull data from the other remote sites. There is an additional replication service for each remote site, called a sub-service. Each sub-service is named to match the remote site. For example, assume a composite cluster with four sites: east, west, north and south. On the east cluster, there would be the following three additional replication streams:

- east_from_west
- east_from_north
- east_from_south

As you can see, each sub-service is named in a way that makes it easy to understand. Reading sub-service names is also simple, and so for example “east_from_west” is stated as “I am in cluster east and pulling THL from cluster west”. Below is a diagram showing just two clusters within the “usa” composite service – east and west.

Cross-Site Replication Architecture - Pros and Cons

The default architecture is designed so that the relay node for each cluster service gets the most recent information directly from the remote master, reducing the risk of data latency (staleness). As of Tungsten Clustering software version 6.0.4, the cross-site replicators within a CMM deployment can be configured to point to slave nodes, and to prefer slave nodes over master nodes during operation. This configuration allows the slave nodes to handle the load generated by the remote cross-site relays upon the master nodes. This becomes a concern when there are many sites, all of which are pulling THL from the same remote master nodes. Of course, when using this option, one must accept that the downstream data may be delayed by the additional hop, and that the data replicated to the remote sites could (and probably would) be older than it would be using the standard topology.

Tuning Cross-Site Replication - How To Configure the Replication Streams to Meet Your Needs

To configure the cluster to prefer slave nodes over master nodes, use the --policy-relay-from-slave=true option to tpm. Both master and slave nodes remain in the list of possible hosts, so if no slave nodes are available during a switch or failover event, then a master will be used.

Summary - The Wrap-Up

In this blog post we discussed Tungsten Composite Multi-Master Cluster cross-site replication configuration. To learn about Continuent solutions in general, check out https://www.continuent.com/solutions

The Library - Please read the docs!
For more information about monitoring Tungsten clusters, please visit https://docs.continuent.com. Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business! For more information, please visit https://www.continuent.com/solutions. Want to learn more or run a POC? Contact us.
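As a footnote to the Tuning section above: applying a tpm option such as --policy-relay-from-slave=true typically follows the usual configure-and-update pattern. This is a rough sketch only - the composite service name ‘usa’ is taken from the example above, and the exact invocation depends on how the cluster was installed, so check the Continuent documentation before using it.

# From the staging directory: record the option for the composite service,
# then push the updated configuration out to the cluster.
./tools/tpm configure usa --policy-relay-from-slave=true
./tools/tpm update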
  5. If you need to get only one record from a database table using a CodeIgniter query, you can do it using row(). We can easily return one row from the database in CodeIgniter. I will give you a simple example of fetching a single record from the database using MySQL and CodeIgniter. Sometimes we need to get only one…
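A minimal sketch of the approach this excerpt describes, using the CodeIgniter 3 query builder (the table and column names here are made up for illustration):

// Inside a CodeIgniter 3 model or controller: fetch at most one record.
// row() returns the first row of the result set as an object, or NULL if nothing matched.
$query = $this->db->get_where('users', array('id' => 1));
$user  = $query->row();

if ($user !== NULL) {
    echo $user->first_name;
}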