Planet MySQL

Planet MySQL - https://planet.mysql.com
  1. One of the humbling things about working at Oracle with the various MySQL personnel is that you are often blown away by something one of them says or does.  And that is on a regular basis.  In this case it is Dr. Charles Bell who gave a great series of presentations last June at the Southeast Linuxfest. In particular he presented in a full formed state some ideas that had been rattling around in my  skull (but no way near as coherent) on how to take advantage of the MySQL JSON data type.  Below are his points from his slide deck.  I was reviewing my notes from his presentation when I realized that this information really needs to be more widely disseminated.   And I would like your feedback on these ideas? 1.. We can use a JSON field to eliminate one of the issues of traditional database solutions: many-to-many-joins This allows more freedom to store unstructured data (data with pieces missing) You still use SQL to work with the data via a database connector but the JSON documents in the table can be manipulated directly in code. Joins can be expensive. Reducing how many places you need to join data can help speed up your queries.  Removing joins may result in some level of denormalization but can result in fast access to the data. 2. Plan For Mutability Schemaless designs are focused on mutability. Build your applications with the ability to modify the document as needed (and within reason) 3. Remove Many-to-Many Relationships Use embedded arrays and lists to store relationships among documents.  This can be as simple as embedding the data in the document or embedding an array of document ids in the document. In the first case data is available as soon as you can read the document and in the second it only takes one additional step to retrieve the data. In cases of seldom read (used) relationships, having the data linked with an array of ids can be more efficient (less data to read on the first pass) This presentation and others from the Southeast Linuxfest will be available online and I will be sure to post about that when it happens. And a big thank you to Dr. Chuck for these ideas.
  2. MariaDB Connector/Node.js First Alpha Now Available diego Dupin Fri, 07/20/2018 - 09:33 MariaDB is pleased to announce the immediate availability of MariaDB Connector/Node.js alpha version 0.7.0. This is a non-blocking MariaDB client for Node.js, 100 percent JavaScript, compatible with Node.js 6+. Why a new client? While there are existing clients that work with MariaDB, (such as the mysql and mysql2 clients), the MariaDB Node.js Connector offers new functionality, like insert Streaming and Pipelining while making no compromises on performance. Insert Streaming Using a Readable stream in your application, you can stream INSERT statements to MariaDB through the Connector. https.get('https://someContent', readableStream => { //readableStream implement Readable, driver will stream data to database connection.query("INSERT INTO myTable VALUE (?)", [readableStream]); }); Pipelining With Pipelining, the Connector sends commands without waiting for server results, preserving order. For instance, consider the use of executing two INSERT statements. The Connector doesn't wait for query results before sending the next INSERT statement. Instead, it sends queries one after the other, avoiding much of the network latency. Quick Start The MariaDB Connector is available through the Node.js repositories. You can install it using npm. $ npm install mariadb Using ECMAScript 2017: const mariadb = require('mariadb'); const pool = mariadb.createPool({host: 'mydb.com', user:'myUser', connectionLimit: 5}); async function asyncFunction() { let conn; try { conn = await pool.getConnection(); const rows = await conn.query("SELECT 1 as val"); console.log(rows); //[ {val: 1}, meta:... ] const res = await conn.query("INSERT INTO myTable value (?, ?)", [1, "mariadb"]); console.log(res); // { affectedRows: 1, insertId: 1, warningStatus: 0 } } catch (err) { throw err; } finally { if (conn) return conn.end(); } } Documentation can be found on the  MariaDB knowledge base and sources are on GitHub. Benchmarks Comparing the MariaDB Connector with other Node.js clients: promise-mysql version 3.3.1 + mysql version 2.15.0 mysql2 version 1.5.3 promise-mysql : 1,366 ops/sec ±1.42% mysql2 : 1,469 ops/sec ±1.63% mariadb : 1,802 ops/sec ±1.19% Benchmarks for the MariaDB Node.js Connector are done using the popular benchmark.js package. You can find the source code for our benchmarks in the benchmarks/ folder. Roadmap The MariaDB Node.js Connector remains in development. This is an alpha release so we do not recommend using it in production. Below is a list of features being developed for future connector releases. PoolCluster MariaDB ed25519 plugin authentication Query Timeouts Bulk Insertion, (that is, fast batch). Download the MariaDB Node.js Connector directly. MariaDB is pleased to announce the immediate availability of MariaDB Connector/Node.js 0.7.0, which is the first alpha version. See the release notes and changelog for details. Login or Register to post comments
  3. Welcome to the third part of this series. I’m glad you’re still reading, as hopefully this means you find this subject interesting at least. Previously we presented the first two components of MySQL InnoDB Cluster: Group Replication and MySQL Router and now we will discuss the last component, MySQL Shell. MySQL Shell This is the last component in the cluster and I love it. Oracle have created this tool to centralize cluster management, providing a friendly, command-line based user interface. The tool can be defined as an advanced MySQL shell, which is much more powerful than the well known MySQL client. With the capacity to work with both relational and document (JSON) data, the tool provides an extended capability to interact with the database from a single place. MySQL Shell is also able to understand different languages: JavaScript (default) which includes several built-in functions to administer the cluster—create, destroy, restart, etc.—in a very easy way. Python it provides an easy way to write Python code to interact with the database. This is particularly useful for developers who don’t need to have SQL skills or run applications to test code. SQL to work in classic mode to query database as we used to do with the old MySQL client. A very interesting feature provided with MySQL Shell is the ability to establish different connections to different servers/clusters from within the same shell. There is no need to exit to connect to a different server, just issuing the command \connect will make this happen. As DBA, I find this pretty useful when handling multiple clusters/servers. Some of the features present in this tool: Capacity to use both Classic and X protocols. Online switch mode to change languages (JavaScript, Python and SQL) Auto-completion of commands using tab, a super expected feature in MySQL client. Colored formatting output that also supports different formats like Table, Tab-separated and Json formats. Batch mode that processes batches of commands allowing also an interactive mode to print output according each line is processed. Some sample commands Samples of new tool and execution modes: #switch modes \sql \js \py #connect to instance \connect user@host:[port] #create a cluster (better to handle through variables) var cluster=dba.createCluster('percona') #add instances to cluster cluster.addInstance(‘root@192.168.70.2:3306’) #check cluster status cluster.status() #using another variable var cluster2=dba.getCluster(‘percona’) cluster.status() #get cluster structure cluster.describe() #rejoin instance to cluster - needs to be executed locally to the instance cluster.rejoinInstance() #rejoin instance to cluster - needs to be executed locally to the instance cluster.rejoinInstance() #recover from lost quorum cluster.forceQuorumUsingPartitionOf(‘root@localhost:3306’) #recover from lost quorum cluster.rebootClusterFromCompleteOutage() #destroy cluster cluster.dissolve({force:true}); Personally, I think this tool is a very good replacement for the classic MySQL client. Sadly, mysql-server installations do not include MySQL shell by default, but it is worth getting used to. I recommend you try it. Conclusion We finally reached the end of this series. I hope you have enjoyed this short introduction to what seems to be Oracle’s bid to have a built-in High Availability solution based on InnoDB. It may become a good competitor to Galera-based solutions. Still, there is a long way to go, as the tool was only just released as GA (April 2018). There are a bunch of things that need to be addressed before it becomes consistent enough to be production-ready. In my personal opinion, it is not—yet. Nevertheless, I think it is a great tool that will eventually be a serious player in the HA field as it’s an excellent, flexible and easy to deploy solution. The post InnoDB Cluster in a Nutshell Part 3: MySQL Shell appeared first on Percona Database Performance Blog.
  4. In this blog post, we talk about the basic function and features of the Tungsten Connector. The Tungsten Connector is an intelligent MySQL proxy that provides key high-availability and read-scaling features. This includes the ability to route MySQL queries by inspecting them in-flight. The most important function of the Connector is failover handling. When the cluster detects a failed master because the MySQL server port is no longer reachable, the Connectors are signaled and traffic is re-routed to the newly-elected Master node. Next is the ability to route MySQL queries based on various factors. In the default Bridge mode, traffic is routed at the TCP layer, and read-only queries must be directed to a different port (normally 3306 for writes and 3307 for reads). There are additional modes, Proxy/Direct and Proxy/SmartScale. In both cases, queries are intercepted and inspected by the Connector. The decisions made are tunable based on configuration parameters. MySQL must be configured for autocommit=1 (MySQL Docs) for this to work. Additionally, any query transaction wrapped with BEGIN and COMMIT will be sent to the Master directly. In Proxy/Direct mode, a SELECT-only statement that does no writes will be sent to a read slave automatically. Unlike SmartScale, Direct routing pays no attention to the session state, or replicated data consistency. This means that performing a write and immediately trying to read the information through a Direct routing connection may fail, because the Connector does not ensure that the written transaction exists on the selected slave. Direct routing is therefore ideal in applications where: Applications perform few writes, but a high number of reads. High proportion of reads on ‘old’ data. For example, blogs, stores, or machine logging information In Proxy/SmartScale mode, an additional check is made on the read slave to determine data “staleness”. In this read-write splitting mode, the Connector intelligently determines if slaves are up-to-date with respect to the master, and selects them in such a way that reads are always strictly consistent with the last write of their current session. This is extremely useful when doing read-behind-write operations. Configuration of Proxy and SmartScale modes relies upon a text file called user.map which, at it’s most basic, defines the users that are connecting through. Each user has three required and one optional field:{user} {password} {service} [affinity] For example: app_user secret global west Without at least one user entry, communications via the Connector will fail. Best of all, each Connector node may be configured differently, so that it is possible to have all of the following available at once using four separate node instances: Bridge mode read/write (very fast) Bridge mode read-only (very fast) Proxy/Direct auto-r/w splitting (slower due to inspection) Proxy/SmartScale automatic session and latency-sensitive r/w splitting (slower still due to slave status query) In summary, the Connector allows for both proper failover handing as well as a variety of ways to route MySQL queries to read slaves. In future articles, we will cover more advanced subjects like failover behavior tuning and other MySQL query routing methods, like SQL-based, port-based and hostname-based. Questions? Contact Continuent
  5. Data Streaming with MariaDB Faisal Thu, 07/19/2018 - 12:07 Big Data vs Fast Data While Big Data is being used across the globe by companies to solve their analytical problems, sometimes it becomes a hassle to extract data from a bunch of data sources, do the necessary transformation and then eventually load it into an analytical platform such as Hadoop or something else. This obviously takes time and effort, instead of a bunch of ETL jobs, MariaDB provides a data streaming solution directly from OLTP MariaDB TX to OLAP MariaDB AX. Fast real-time data streaming is achieved with the help of a CDC (Change Data Capture) framework that streams data from MariaDB to MariaDB ColumnStore using MariaDB MaxScale 2.2.x as a bin-log router. MaxScale makes use of the Binary Logs from a MariaDB TX server and steams the data directly to MariaDB AX (MariaDB ColumnStore) for analytics. This is achieved with the MaxScale CDC Connector, CDC Adapter, and the ColumnStore API package. This sounds a bit complex but in reality, it's quite simple. Here is a quick look at how the data streaming setup works: MariaDB-server → MaxScale → MaxScale-CDC-Connector → MaxScale-CDC-Adapter → ColumnStore-API → ColumnStore Database The Setup Requirements Here is a quick look at the setup that we are going to be working on, I am using Oracle Virtual Machines with CentOS 7 1x CentOS 7 VM for MariaDB TX 3.0 1x CentOS 7 VM for MaxScale as a Replication Router 1x CentOS 7 VM for CDC Adapter + Connector and ColumnStore API 1x CentOS 7 VM for ColumnStore, we will be using a single node "combined" ColumnStore setup for simplicity, refer to MariaDB Columnstore Installation Guide page for detailed setup instruction  Our Setup will look like this MariaDB (192.168.56.101) MaxScale (192.168.56.102) CDC Server (192.168.56.103) ColumnStore (192.168.56.104) Note: We will assume that you already have a MariaDB (Source of our data streaming) and ColumnStore (Data's final destination) readily available for use.  Preparing the VM OS There are a few important things that are required before we start the installations. Note: the following steps must be performed and validated on all the VMs Disable SELinux For the purposes of testing, we want SELinux disabled. Make sure that your SELinux configuration, in the file /etc/selinux/config, looks something like this on all the nodes: # This file controls the state of SELinux on the system. # SELINUX= can take one of these three values: # enforcing - SELinux security policy is enforced. # permissive - SELinux prints warnings instead of enforcing. # disabled - No SELinux policy is loaded. SELINUX=disabled # SELINUXTYPE= can take one of these two values: # targeted - Targeted processes are protected, # minimum - Modification of targeted policy. Only selected processes are protected. # mls - Multi Level Security protection. SELINUXTYPE=targeted The change here is the SELinux setting of course. After a reboot of the server, check if the SELinux has actually been disabled, use either of the two commands (sestatus/getenforce) to confirm [root@localhost ~] sestatus SELinux status: disabled [root@localhost ~] getenforce Disabled Disable firewalld Firewalld is a standard service that is disabled using the systemctl command on the REHL 7/CETOS 7. After disabling double check using systemctl status firewalld on all the nodes [root@localhost ~] systemctl stop firewalld [root@localhost ~] systemctl disable firewalld Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service. Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service. [root@localhost ~] systemctl status firewalld ● firewalld.service - firewalld - dynamic firewall daemon Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled) Active: inactive (dead) Docs: man:firewalld(1) Setup up the MariaDB account The mariadb user and group are required to be created on all the nodes. sudo privilege for the mariadb user is also mandatory. Following this, all steps will be done using mariadb user with the help of sudo unless specified differently. [root@localhost ~] groupadd mysql [root@localhost ~] useradd -g mysql mysql [root@localhost ~] echo "mysql ALL=(ALL) ALL" >> /etc/sudoers [root@localhost ~] passwd mysql Changing password for user mysql. New password: Retype new password: passwd: all authentication tokens updated successfully. Enable Networking on the VirtualBox VMs This is important so that the VMs could use Static IP addresses and also enable them to use Internet access of the host operating system. Note: Make sure that NAT and Host-Only Adapters are both enabled for all the VMs. Network setting within the VMs will need to be modified as follows for NAT: [root@localhost ~] cat /etc/sysconfig/network-scripts/ifcfg-enp0s3 TYPE=Ethernet BOOTPROTO=dhcp DEFROUTE=yes IPV4_FAILURE_FATAL=no IPV6INIT=no NAME=enp0s3 DNS1=8.8.8.8 DNS2=8.8.4.4 UUID=89bfb7a3-17c3-49f9-aaa6-4d66c47ea6fb DEVICE=enp0s3 ONBOOT=yes PEERDNS=yes PEERROUTES=yes For Host-Only Adapter edit the IP address accordingly for each VM: [root@localhost ~] cat /etc/sysconfig/network-scripts/ifcfg-enp0s8 TYPE=Ethernet BOOTPROTO=none DEFROUTE=yes IPV4_FAILURE_FATAL=no IPV6INIT=no NAME=enp0s8 PREFIX=24 IPADDR=192.168.56.101 UUID=21a845ad-8cae-4a2c-b563-164a1f8a30cf DEVICE=enp0s8 ONBOOT=yes Note: Make sure the above two files (ifcfg-enp0s3m, ifcfg-enp0s8) are changed on each Node/VM according and has it's own distinct Static IP defined in the file (ifcfg-enp0s8). Set Hostnames on each VM for a better setup, this step is not mandatory but a good practice. MariaDB VM: [root@localhost ~] hostnamectl set-hostname mariadb101 MaxScale VM: [root@localhost ~] hostnamectl set-hostname maxscale102 CDC VM: [root@localhost ~] hostnamectl set-hostname cdc103 Install MariaDB TX 3.0 Setup the Repository Login to the MariaDB Server (192.168.56.101) Setup MariaDB repository by using the Repository Setup page or use the curl interface to do this automatically for your OS or curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash Note: Not all the Linux distros are supported by the curl tool Once the repository has been set up, simply use "yum install" yum -y install MariaDB-server Take care with the Unix's case sensitivity We will be using the curl script to set up the MariaDB repositories, on production environments internet access is normally not available, in that case, we can download the RPMs externally and transfer the files to the servers using your favorite secure file transfer tools. +[root@mariadb101 ~] su - mysql [mysql@mariadb101 ~]$ sudo curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash [info] Repository file successfully written to /etc/yum.repos.d/mariadb.repo. [info] Adding trusted package signing keys... [info] Succeessfully added trusted package signing keys. [mysql@mariadb101 ~]$ sudo yum install MariaDB-server mariadb-main | 2.9 kB 00:00:00 mariadb-maxscale | 2.4 kB 00:00:00 mariadb-tools | 2.9 kB 00:00:00 (1/3): mariadb-maxscale/7/x86_64/primary_db | 6.3 kB 00:00:01 (2/3): mariadb-tools/7/x86_64/primary_db | 9.6 kB 00:00:01 (3/3): mariadb-main/7/x86_64/primary_db | 48 kB 00:00:01 Resolving Dependencies ================================================================================================ Package Arch Version Repository Size ================================================================================================ Installing: MariaDB-compat x86_64 10.3.7-1.el7.centos mariadb-main 2.8 M replacing mariadb-libs.x86_64 1:5.5.56-2.el7 MariaDB-server x86_64 10.3.7-1.el7.centos mariadb-main 123 M Installing for dependencies: MariaDB-client x86_64 10.3.7-1.el7.centos mariadb-main 53 M MariaDB-common x86_64 10.3.7-1.el7.centos mariadb-main 157 k galera x86_64 25.3.23-1.rhel7.el7.centos mariadb-main 8.0 M Transaction Summary ================================================================================================= Install 2 Packages (+40 Dependent packages) Total download size: 200 M Is this ok [y/d/N]: Y ... ... Complete!   Installation We will use mariadb user to install MariaDB on the MariaDB server. [root@mariadb101 ~] su - mysql [mysql@mariadb101 ~]$ sudo curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash [info] Repository file successfully written to /etc/yum.repos.d/mariadb.repo. [info] Adding trusted package signing keys... [info] Succeessfully added trusted package signing keys. [mysql@mariadb101 ~]$ sudo yum install MariaDB-server mariadb-main | 2.9 kB 00:00:00 mariadb-maxscale | 2.4 kB 00:00:00 mariadb-tools | 2.9 kB 00:00:00 (1/3): mariadb-maxscale/7/x86_64/primary_db | 6.3 kB 00:00:01 (2/3): mariadb-tools/7/x86_64/primary_db | 9.6 kB 00:00:01 (3/3): mariadb-main/7/x86_64/primary_db | 48 kB 00:00:01 Resolving Dependencies ================================================================================================ Package Arch Version Repository Size ================================================================================================ Installing: MariaDB-compat x86_64 10.3.7-1.el7.centos mariadb-main 2.8 M replacing mariadb-libs.x86_64 1:5.5.56-2.el7 MariaDB-server x86_64 10.3.7-1.el7.centos mariadb-main 123 M Installing for dependencies: MariaDB-client x86_64 10.3.7-1.el7.centos mariadb-main 53 M MariaDB-common x86_64 10.3.7-1.el7.centos mariadb-main 157 k galera x86_64 25.3.23-1.rhel7.el7.centos mariadb-main 8.0 M Transaction Summary ================================================================================================ Install 2 Packages (+40 Dependent packages) Total download size: 200 M Is this ok [y/d/N]: Y ... ... Complete! Enter "Y" when prompted, it will download and install MariaDB server and Client along with all its dependencies. Install MariaDB MaxScale 2.2.x Setup the Repository Login to the MaxScale server (102.168.56.102) and Use "curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash" Just like with MariaDB, we will use mariadb user to set up the MariaDB repositories and install MaxScale using yum [root@maxscale102 ~] curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash [info] Repository file successfully written to /etc/yum.repos.d/mariadb.repo. [info] Adding trusted package signing keys... [info] Succeessfully added trusted package signing keys. [root@maxscale102 ~] su - mysql Last login: Fri Jun 22 14:39:53 EDT 2018 on pts/0 [mysql@maxscale102 ~]$ sudo yum install maxscale mariadb-main | 2.9 kB 00:00:00 mariadb-maxscale | 2.4 kB 00:00:00 mariadb-tools | 2.9 kB 00:00:00 (1/3): mariadb-maxscale/7/x86_64/primary_db | 6.3 kB 00:00:01 (2/3): mariadb-tools/7/x86_64/primary_db | 9.6 kB 00:00:01 (3/3): mariadb-main/7/x86_64/primary_db | 48 kB 00:00:02 Resolving Dependencies Dependencies Resolved ================================================================================================ Package Arch Version Repository Size ================================================================================================ Installing: maxscale x86_64 2.2.9-1 mariadb-maxscale 18 M Installing for dependencies: gnutls x86_64 3.3.26-9.el7 base 677 k nettle x86_64 2.7.1-8.el7 base 327 k trousers x86_64 0.3.14-2.el7 base 289 k Transaction Summary ================================================================================================ Install 1 Package (+3 Dependent packages) Total download size: 19 M Installed size: 70 M Is this ok [y/d/N]: Y ... ... Complete! [mysql@maxscale102 ~]$ sudo systemctl enable maxscale Created symlink from /etc/systemd/system/multi-user.target.wants/maxscale.service to /usr/lib/systemd/system/maxscale.service. [mysql@maxscale102 ~]$ systemctl status maxscale ● maxscale.service - MariaDB MaxScale Database Proxy Loaded: loaded (/usr/lib/systemd/system/maxscale.service; enabled; vendor preset: disabled) Active: inactive (dead) Install CDC Server Setup the Repository Login to the CDC Server (102.168.56.103) we will need to set up two extra repositories that are currently not automatically added by the curl script. CDC Adapter, however, will automatically be taken care of by it. Refer to the following links to the direct path to the two RPMs mariadb-columnstore-api mariadb-columnstore-data-adapters Using the above URLs, install the API and Adapter directly from the path, the connector, however, should automatically download based on the repository setup. Sequence of Installation install maxscale-cdc-connector install epel-release install libuv install columnstore-api install cdc-adapters Install MaxScale CDC Connector [root@cdc103 ~] yum install maxscale-cdc-connector Resolving Dependencies Dependencies Resolved ================================================================================================ Package Arch Version Repository Size ================================================================================================ Installing: maxscale-cdc-connector x86_64 2.2.9-1 mariadb-maxscale 214 k Transaction Summary ================================================================================================ Install 1 Package Total download size: 214 k Installed size: 1.4 M Is this ok [y/d/N]: y ... ... Complete! Install Extra Packages for Enterprise Linux epel [root@cdc103 ~] yum install epel-release Resolving Dependencies Dependencies Resolved ============================================================================================== Package Arch Version Repository Size ============================================================================================== Installing: epel-release noarch 7-11 extras 15 k Transaction Summary ============================================================================================== Install 1 Package Total download size: 15 k Installed size: 24 k Is this ok [y/d/N]: y ... ... Complete! [root@cdc103 ~] yum install libuv Resolving Dependencies Dependencies Resolved ============================================================================================== Package Arch Version Repository Size ============================================================================================== Installing: libuv x86_64 1:1.19.2-1.el7 epel 121 k Transaction Summary ============================================================================================== Install 1 Package Total download size: 121 k Installed size: 308 k Is this ok [y/d/N]: y Downloading packages: ... ... Complete! Install ColumnStore API Package You can Install the API directly from the source or download the RPM using " wget " and then install it locally. [root@cdc103 ~] yum install https://downloads.mariadb.com/MariaDB/mariadb-columnstore-api/latest/yum/centos/7/x86_64/mariadb-columnstore-api-1.1.5-1-x86_64-centos7.rpm Resolving Dependencies Dependencies Resolved ============================================================================================= Package Arch Version Repository Size ============================================================================================= Installing: mariadb-columnstore-api x86_64 1.1.5-1 /mariadb-columnstore-api-1.1.5-1-x86_64 5.4 M Transaction Summary ============================================================================================= Install 1 Package ... ... Complete! Install CDC Adapter You can Install the CDC Adapter directly from the source or download the RPM using " wget " and then install it locally. [root@cdc103 ~] yum install https://downloads.mariadb.com/MariaDB/data-adapters/mariadb-streaming-data-adapters/latest/yum/centos/7/x86_64/mariadb-columnstore-maxscale-cdc-adapters-1.1.5-1-x86_64-centos7.rpm Resolving Dependencies Dependencies Resolved ================================================================================================ Package Arch Version Repository Size ================================================================================================ Installing: mariadb-columnstore-data-adapters x86_64 1.1.5-1 /mariadb-columnstore-maxscale 77 k Installing for dependencies: maxscale-cdc-connector x86_64 2.2.9-1 mariadb-maxscale 214 k Transaction Summary ================================================================================================ Install 1 Package (+1 Dependent package) Total size: 291 k Total download size: 214 k Installed size: 1.5 M Is this ok [y/d/N]: y ... ... Complete! Setup Communication Between CDC and ColumnStore Now that the CDC Adapter, API, and Connectors have been installed, we can start to set up communication between the CDC and ColumnStore servers. In this exercise, we are using single instance ColumnStore, in case of a distributed install, we will be working with ColumnStore UM1 Node. ColumnStore Configuration Connect to CDC server using ssh client and pull the /home/mysql/mariadb/columnstore/etc/Columnstore.xml from ColumnStore server using scp, rsync or any other method available for file transfer between the servers. Columnstore.xml should be downloaded to /etc and owned by root user. The file can be downloaded from ColumnStore or any node will do as the Columnstore.xml is automatically synchronized between all the nodes. Use mariadb as the remote user to connect to ColumnStore. As this is the first time connecting to ColumnStore from CDC server, Linux will ask for yes/no and for mariadb password. Once the file is downloaded, ensure that it has 644 permission so that everyone can read it. [root@cdc103 ~] scp mysql@192.168.56.104:/home/mysql/mariadb/columnstore/etc/Columnstore.xml /etc The authenticity of host '192.168.56.104 (192.168.56.104)' can't be established. ECDSA key fingerprint is SHA256:Jle/edRpKz9ysV8xp1K9TlIGvbg8Sb1p+GbDob3Id0g. ECDSA key fingerprint is MD5:a1:ce:9d:58:80:c6:ed:5a:95:7b:33:82:68:cb:0f:40. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.56.104' (ECDSA) to the list of known hosts. mysql@192.168.56.104's password: Columnstore.xml 100% 21KB 13.3MB/s 00:00 [root@cdc103 ~] [root@cdc103 ~] ls -rlt /etc/Columnstore.xml -rw-r--r-- 1 root root 21089 Jul 1 05:09 /etc/Columnstore.xml [root@cdc103 ~] Search and replace 127.0.0.1 with ColumnStore's IP address in the newly downloaded Columnstore.xml file using root user as the file is under /etc/ folder. [root@cdc103 etc] sed -i 's/127.0.0.1/192.168.56.104/g' /etc/Columnstore.xml [root@cdc103 etc] Setup Master MariaDB Replication Master Now that the CDC node is ready, it's time to setup MariaDB Server as the replication Master for MaxScale. MaxScale will be working as a Replication Slave for this data streaming setup. Login to the MariaDB Master server using the root user and add the following contents to the /etc/my.cnf.d/server.cnf file under [mysqld] section and restart MariaDB process. [mysqld] server_id=1 log-bin = mariadb-bin binlog-format = ROW gtid_strict_mode = 1 log_error log-slave-updates [root@mariadb101 ~] systemctl restart mariadb The server_id will be used when configuring MaxScale as a replication slave node. Setup Replication User After MariaDB restarts, login to MariaDB using root user and setup replication user maxuser and grant REPLICATION SLAVE privilege to it. The second step is to create a new database and a test table with some data that we are going to use for testing data streaming to ColumnStore. [mysql@mariadb101 ~] mysql -uroot -p Enter password: Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 10 Server version: 10.3.7-MariaDB-log MariaDB Server Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> RESET MASTER; Query OK, 0 rows affected (0.004 sec) MariaDB [(none)]> CREATE USER 'maxuser'@'%' IDENTIFIED BY 'maxpwd'; Query OK, 0 rows affected (0.000 sec) MariaDB [(none)]> GRANT REPLICATION SLAVE ON *.* TO 'maxuser'@'%'; Query OK, 0 rows affected (0.000 sec) MariaDB [(none)]> CREATE DATABASE cdc_test; Query OK, 1 row affected (0.000 sec) MariaDB [(none)]> USE cdc_test; Database changed MariaDB [cdc_test]> CREATE TABLE cdc_tab(id serial, col varchar(100)); Query OK, 0 rows affected (0.010 sec) MariaDB [cdc_test]> INSERT INTO cdc_tab(col) values ('Row 1'), ('Row 2'), ('Row 3'); Query OK, 3 rows affected (0.004 sec) Records: 3 Duplicates: 0 Warnings: 0 MariaDB [cdc_test]> UPDATE cdc_tab SET col = 'Updated Row 2' WHERE id = 2; Query OK, 1 row affected (0.004 sec) Rows matched: 1 Changed: 1 Warnings: 0 MariaDB [cdc_test]> INSERT INTO cdc_tab(col) values ('Row 4'), ('Row 5'), ('Row 6'); Query OK, 3 rows affected (0.004 sec) Records: 3 Duplicates: 0 Warnings: 0 MariaDB [cdc_test]> DELETE FROM cdc_tab where id=5; Query OK, 1 row affected (0.005 sec) MariaDB [cdc_test]> Events that are captured in the binary logs of the master MariaDB Server: INSERT 3 Records to cdc_tab. UPDATE col for a row in cdc_tab. INSERT 3 Records to cdc_tab. DELETE a row from cdc_tab. Setup MaxScale Log on to the MaxScale server, edit the /etc/maxscale.cnf file, remove everything under the [maxscale] threads=auto section and add the following configuration to it. This will set MaxScale for data streaming to ColumnStore using Avro Listener. Take note of the server_id in the [replication-router] section, this points to the master MariaDB server's ID that we defined earlier while setting up MariaDB as a master. #Replication configuration that points to a particular server_id and binlog [replication-router] type=service router=binlogrouter user=maxuser passwd=maxpwd server_id=2 master_id=1 binlogdir=/var/lib/maxscale mariadb10-compatibility=1 filestem=mariadb-bin #Replication listener that will listen to the Master DB using the port #6603 [replication-listener] type=listener service=replication-router protocol=MySQLClient port=6603 #Avro service that will generate JSON files form the Bin-Logs that were received from the MasterDB and store them in the using the replication-router [avro-router] type=service router=avrorouter source=replication-router avrodir=/var/lib/maxscale #Avro listener that is used by the Avro router on a specific port to be used by CDC [avro-listener] type=listener service=avro-router protocol=cdc port=4001 [replication-router] This section defines a bin-log router from Master MariaDB to MaxScale Slave using replication user maxuser which was created earlier. [replication-listener] This is the mysql client listener service. Any server with MariaDB client can connect to MaxScale on the port 6603 specified in this section. We will connect to MaxScale later to set it up as a REPLICATION SLAVE using this port. [avro-router] This is the router service that routes the bin-log data into AVRO (JSON) files. CDC will use these AVRO files to generate bulk loading scripts for ColumnStore database. [avro-listener] Avro Listener uses avro-router service, this listener is used by CDC Adapter to get the AVRO files and stream them to ColumnStore. Once /etc/maxscale.cnf has been modified on the MaxScale server, restart MaxScale service. [root@maxscale102 ~] systemctl restart maxscale Add CDC User / Password for avro-router service in MaxScale [mysql@maxscale102 ~]$ maxctrl call command cdc add_user avro-router cdcuser cdcpassword This user will be the user that MaxScale uses to generate Avro schema files to be transferred and loaded into ColumnStore by the MaxScale CDC Adapter. Setup MaxScale as a REPLICATION SLAVE Install MariaDB client on the MaxScale Server. We will use this client to connect to MaxScale MariaDB listener and set up MaxScale as a replication SLAVE for MariaDB TX. [mysql@maxscale102 ~]$ sudo yum -y install MariaDB-client Resolving Dependencies Dependencies Resolved ============================================================================================== Package Arch Version Repository Size ============================================================================================== Installing: MariaDB-client x86_64 10.3.7-1.el7.centos mariadb-main 53 M MariaDB-compat x86_64 10.3.7-1.el7.centos mariadb-main 2.8 M replacing mariadb-libs.x86_64 1:5.5.56-2.el7 Installing for dependencies: MariaDB-common x86_64 10.3.7-1.el7.centos mariadb-main 157 k Running transaction Installing : MariaDB-common-10.3.7-1.el7.centos.x86_64 1/31 Installing : MariaDB-compat-10.3.7-1.el7.centos.x86_64 2/31 ============================================================================================== Installed: MariaDB-client.x86_64 0:10.3.7-1.el7.centos MariaDB-compat.x86_64 0:10.3.7-1.el7.centos Dependency Installed: MariaDB-common.x86_64 0:10.3.7-1.el7.centos perl.x86_64 4:5.16.3-292.el7 perl-Carp.noarch 0:1.26-244.el7 perl-Encode.x86_64 0:2.51-7.el7 perl-Exporter.noarch 0:5.68-3.el7 perl-File-Path.noarch 0:2.09-2.el7 perl-File-Temp.noarch 0:0.23.01-3.el7 perl-Filter.x86_64 0:1.49-3.el7 perl-Getopt-Long.noarch 0:2.40-3.el7 perl-HTTP-Tiny.noarch 0:0.033-3.el7 perl-PathTools.x86_64 0:3.40-5.el7 perl-Pod-Escapes.noarch 1:1.04-292.el7 perl-Pod-Perldoc.noarch 0:3.20-4.el7 perl-Pod-Simple.noarch 1:3.28-4.el7 perl-Pod-Usage.noarch 0:1.63-3.el7 perl-Scalar-List-Utils.x86_64 0:1.27-248.el7 perl-Socket.x86_64 0:2.010-4.el7 perl-Storable.x86_64 0:2.45-3.el7 perl-Text-ParseWords.noarch 0:3.29-4.el7 perl-Time-HiRes.x86_64 4:1.9725-3.el7 perl-Time-Local.noarch 0:1.2300-2.el7 perl-constant.noarch 0:1.27-2.el7 perl-libs.x86_64 4:5.16.3-292.el7 perl-macros.x86_64 4:5.16.3-292.el7 perl-parent.noarch 1:0.225-244.el7 perl-podlators.noarch 0:2.5.1-3.el7 perl-threads.x86_64 0:1.87-4.el7 perl-threads-shared.x86_64 0:1.43-6.el7 Replaced: mariadb-libs.x86_64 1:5.5.56-2.el7 Complete! Now we can Connect to MaxScale MariaDB service using the installed client and setup MaxScale as a REPLICATION SLAVE; use proper Master MariaDB IP in the CHANGE MASTER TO MASTER_HOST followed by START SLAVE and SHOW SLAVE STATUS to ensure the slave is running without any issues. This setup is just like setting up a MariaDB Master/Slave replication, the only difference, in this case, is that we are setting up MaxScale as a bin-log router slave. Take note that, we have used -h 192.168.56.102 argument to pass in MaxScale's IP address and -P 6603 argument to pass in the [replication-listener] port from the /etc/maxscale.cnf file. That port is for mysqlclient service, that is why we are able to connect to MaxScale's MariaDB interface. [mysql@maxscale102 ~]$ mysql -h 192.168.56.102 -u maxuser -p maxpwd -P 6603 Enter password: ERROR 1049 (42000): Unknown database 'maxpwd' [mysql@maxscale102 ~]$ mysql -h192.168.56.102 -umaxuser -pmaxpwd -P6603 Welcome to the MariaDB monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 10.2.12 2.2.9-maxscale Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MySQL [(none)]> CHANGE MASTER TO MASTER_HOST='192.168.56.101', MASTER_PORT=3306, MASTER_USER='maxuser', MASTER_PASSWORD='maxpwd', MASTER_LOG_FILE='mariadb-bin.000001', MASTER_LOG_POS=4; Query OK, 0 rows affected (0.001 sec) MySQL [(none)]> START SLAVE; Query OK, 0 rows affected (0.201 sec) MySQL [(none)]> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Binlog Dump Master_Host: 192.168.56.101 Master_User: maxuser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000001 Read_Master_Log_Pos: 2047 Relay_Log_File: mariadb-bin.000001 Relay_Log_Pos: 2047 Relay_Master_Log_File: mariadb-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2047 Relay_Log_Space: 2047 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: 5a2189b0-7d1a-11e8-b04f-080027ad63ed Master_Info_File: /var/lib/maxscale/master.ini SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave running Master_Retry_Count: 1000 Master_Bind: Last_IO_Error_TimeStamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 1 row in set (0.000 sec) MySQL [(none)]> Start Data Streaming With all the settings in place, we can now start data streaming from MariaDB TX to ColumnStore in MariaDB AX using the CDC and Avro data. Log in to CDC Server and execute mxs_adapter to start streaming. [mysql@cdc103 ~]$ mxs_adapter -c /etc/Columnstore.xml -u cdcuser -p cdcpassword -h 192.168.56.102 -P 4001 cdc_test cdc_tab Let's review this command and its arguments: -c indicates the path of the Columnstore.xml configuration file -u / -p indicates the CDC User and Password that were created earlier -h is the IP address of the MaxScale server -P is the port on which MaxScale [avro-listener] service is listening to cdc_test is the database name on the source (MariaDB) and target (ColumnStore) cdc_tab is the table name on the source (MariaDB) and target (ColumnStore) The target database/table should be already created before starting mxs_adapter, if not, mxs_adapter will provide a script to create the table in ColumnStore automatically. Let's start the service and see what happens. [mysql@cdc103 ~]$ mxs_adapter -c /etc/Columnstore.xml -u cdcuser -p cdcpassword -h 192.168.56.102 -P 4001 cdc_test cdc_tab Table not found, create with: CREATE TABLE cdc_test.cdc_tab (col varchar(100), domain int, event_number int, event_type varchar(50), id serial, sequence int, server_id int, timestamp int) ENGINE=ColumnStore; [mysql@cdc103 ~]$ As expected, the error message "Table not found", copy the CREATE TABLE script and execute it on the ColumnStore node using mcsmysql interface. [root@um1 ~] mcsmysql Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 43 Server version: 10.2.15-MariaDB-log Columnstore 1.1.5-1 Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> CREATE DATABASE cdc_test; Query OK, 1 row affected (0.00 sec) MariaDB [(none)]> CREATE TABLE cdc_test.cdc_tab (col varchar(100), domain int, event_number int, event_type varchar(50), id serial, sequence int, server_id int, timestamp int) ENGINE=ColumnStore; ERROR 1069 (42000): Too many keys specified; max 0 keys allowed MariaDB [(none)]> There is another problem! Our source table had a serial column which MariaDB treats as a Primary Key. Since ColumnStore does not support Primary Keys / Indexes we will need to change the CREATE TABLE script before executing it in the ColumnStore DB. Simply change, serial to bigint unsigned . Let's log back into ColumnStore mcsmysql and recreate the table. MariaDB [(none)]> DROP TABLE cdc_test.cdc_tab; Query OK, 0 rows affected (0.02 sec) MariaDB [(none)]> CREATE TABLE cdc_test.cdc_tab (col varchar(100), domain int, event_number int, event_type varchar(50), id bigint unsigned, sequence int, server_id int, timestamp int) ENGINE=ColumnStore; Query OK, 0 rows affected (0.55 sec) MariaDB [(none)]> Another thing to note here is that the table structure in ColumnStore is quite different from the source. There are a few additional columns. This table is an event table that captures all the INSERT, UPDATE and DELETE events. We can use this data to identify the latest row for each specific ID and run some analytics. One more thing to notice here is that at the moment, one mxs_adapter can only handle one table at a time. This is to prevent streaming the entire MariaDB databases to ColumnStore. Instead, we could create aggregate tables on MariaDB TX and stream that data to ColumnStore for better analytics use case. Let's start the mxs_adapter once more and see if it can stream the data from our source MariaDB TX to target MariaDB AX. [mysql@cdc103 ~]$ mxs_adapter -c /etc/Columnstore.xml -u cdcuser -p cdcpassword -h 192.168.56.102 -P 4001 cdc_test cdc_tab 4 rows and 1 transactions inserted in 3.3203 seconds. GTID = 0-1-6 2 rows and 1 transactions inserted in 0.792351 seconds. GTID = 0-1-7 3 rows and 1 transactions inserted in 0.604614 seconds. GTID = 0-1-8 Success! The three events are captured, INSERT, UPDATE and DELETE and their GTIDs. Let's log in to ColumnStore and review the data in the cdc_tab table. MariaDB [cdc_test]> SHOW TABLES; +--------------------+ | Tables_in_cdc_test | +--------------------+ | cdc_tab | +--------------------+ 1 row in set (0.00 sec) MariaDB [cdc_test]> SELECT * FROM cdc_tab; +---------------+--------+--------------+---------------+------+----------+-----------+------------+ | col | domain | event_number | event_type | id | sequence | server_id | timestamp | +---------------+--------+--------------+---------------+------+----------+-----------+------------+ | Row 1 | 0 | 1 | insert | 1 | 5 | 1 | 1530439436 | | Row 2 | 0 | 2 | insert | 2 | 5 | 1 | 1530439436 | | Row 3 | 0 | 3 | insert | 3 | 5 | 1 | 1530439436 | | Row 2 | 0 | 1 | update_before | 2 | 6 | 1 | 1530439675 | | Updated Row 2 | 0 | 2 | update_after | 2 | 6 | 1 | 1530439675 | | Row 4 | 0 | 1 | insert | 4 | 7 | 1 | 1530439697 | | Row 5 | 0 | 2 | insert | 5 | 7 | 1 | 1530439697 | | Row 6 | 0 | 3 | insert | 6 | 7 | 1 | 1530439697 | | Row 5 | 0 | 1 | delete | 5 | 8 | 1 | 1530439714 | +---------------+--------+--------------+---------------+------+----------+-----------+------------+ 9 rows in set (0.17 sec) MariaDB [cdc_test]> We can see the INSERT, UPDATE (Before and After) and a DELETE. SEQUENCE column indicates the sequence in which these events were triggered and the timestamp column indicates the time as and when these events took place. Avro files While setting up MaxScale we specified avrodir as /var/lib/mysql . This can be any different location. [avro-router] type=service router=avrorouter source=replication-router avrodir=/var/lib/maxscale Let's see the contents of the avrodir folder. We can see two files generated .avsc contains the data structure while .avro contains the actual data from the bin logs. [root@maxscale102 maxscale] pwd /var/lib/maxscale [root@maxscale102 maxscale] ls -rlt total 40 drwxr-xr-x 2 maxscale maxscale 6 Jun 22 14:44 maxscale.cnf.d -rw-r--r-- 1 maxscale maxscale 54 Jun 22 14:44 maxadmin-users -rw-r--r-- 1 maxscale maxscale 84 Jun 22 14:44 passwd drwxr--r-- 2 maxscale maxscale 25 Jul 1 04:25 MariaDB-Monitor drwxr-xr-x 2 maxscale maxscale 6 Jul 1 06:34 data1288 drwxr-xr-x 2 maxscale maxscale 22 Jul 1 06:46 avro-router -rw------- 1 maxscale maxscale 183 Jul 1 06:50 master.ini drwx------ 2 maxscale maxscale 199 Jul 1 06:50 cache -rw-r--r-- 1 maxscale maxscale 4096 Jul 1 06:50 gtid_maps.db -rw-r--r-- 1 maxscale maxscale 2047 Jul 1 06:50 mariadb-bin.000001 -rw-r--r-- 1 maxscale maxscale 572 Jul 1 06:50 cdc_test.cdc_tab.000001.avsc -rw-r--r-- 1 maxscale maxscale 797 Jul 1 06:50 cdc_test.cdc_tab.000001.avro -rw-r--r-- 1 maxscale maxscale 5120 Jul 1 06:50 avro.index -rw-r--r-- 1 maxscale maxscale 69 Jul 1 07:29 avro-conversion.ini [root@maxscale102 maxscale] In case of some issue with the data streaming, one can try to restart MaxScale service. If there's still a failure, as a last resort one can delete *.avsc, *.avro, avro.index and avro-conversion.ini and restart the MaxScale service. It should be able to recover. Since the mxs_adapter is running, as and when new data is inserted in the cdc_test.cdc_tab table, it will automatically be streamed into ColumnStore. Conclusion Using this setup, we can stream data directly from OLTP MariaDB TX not only to MariaDB AX but also to other sources that can take Avro/JSON data. There is a Kafka Adapter already available in the Data Adapters download page. References MariaDB TX download (includes MariaDB MaxScale) MariaDB AX download (includes MariaDB ColumnStore) CDC Connectors CDC Adapter Thanks. Big Data ColumnStore How to MariaDB Releases MaxScale This blog is about setting up Data Streaming from MariaDB TX to MariaDB AX. We will be using Virtual Box CentOS images. Login or Register to post comments