
Severalnines Launches #MySQLHA CrowdChat

Today we launch our live CrowdChat on everything #MySQLHA!

This CrowdChat is brought to you by Severalnines and is hosted by a community of subject matter experts. CrowdChat is a community platform that works across Facebook, Twitter, and LinkedIn to allow users to discuss a topic using a specific #hashtag. This CrowdChat focuses on the hashtag #MySQLHA, so whether you’re a DBA, architect, CTO, or a database novice, register to join and become part of the conversation!

Join this online community to interact with experts on Galera clusters. Get your questions answered and join the conversation around everything #MySQLHA.

Register free

Meet the experts

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and database expert with over 15 years of experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad view of the whole database environment: from MySQL to Couchbase, Vertica to Hadoop, and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

Krzysztof Książek, a Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

Ashraf Sharif is a System Support Engineer at Severalnines. He previously worked as a principal consultant and head of a support team, delivering clustering solutions for big websites in the South East Asia region. His professional interests focus on system scalability and high availability.

Vinay Joosery is a passionate advocate and builder of concepts and businesses around Big Data computing infrastructures. Prior to co-founding Severalnines, Vinay held the post of Vice-President EMEA at Pentaho Corporation - the Open Source BI leader. He has also held senior management roles at MySQL / Sun Microsystems / Oracle, where he headed the Global MySQL Telecoms Unit, and built the business around MySQL's High Availability and Clustering product lines. Prior to that, Vinay served as Director of Sales & Marketing at Ericsson Alzato, an Ericsson-owned venture focused on large scale real-time databases.


Infrastructure Automation - Deploying ClusterControl and MySQL-based systems on AWS using Ansible

We recently made a number of enhancements to the ClusterControl Ansible Role, so it now also supports automatic deployment of MySQL-based systems (MySQL Replication, Galera Cluster, NDB Cluster). The updated role uses the awesome ClusterControl RPC interface to automate deployments. It is available at Ansible Galaxy and Github.

TL;DR: It is now possible to define your database clusters directly in a playbook (see the example below), and let Ansible and ClusterControl automate the entire deployment:

cc_cluster:
  - deployment: true
    operation: "create"
    cluster_type: "galera"
    mysql_hostnames:
      - "192.168.1.101"
      - "192.168.1.102"
      - "192.168.1.103"
    mysql_password: "MyPassword2016"
    mysql_port: 3306
    mysql_version: "5.6"
    ssh_keyfile: "/root/.ssh/id_rsa"
    ssh_user: "root"
    vendor: "percona"

What’s New?

The major improvement is that you can now automatically deploy a new database setup while deploying ClusterControl. You can also register already-deployed databases.

Define your database cluster in the playbook, within the “cc_cluster” item, and ClusterControl will perform the deployment. We also introduced a bunch of new variables to simplify the initial setup of ClusterControl for default admin credentials, ClusterControl license and LDAP settings.

Along with these improvements, we can leverage Ansible’s built-in cloud modules to automate the rest of the infrastructure that our databases rely on - instance provisioning, resource allocation, network configuration and storage options. In simple words, write your infrastructure as code and let Ansible work together with ClusterControl to build the entire stack.

We also included example playbooks for reference. Check them out on our Github repository page.

Example Deployment on Amazon EC2

In this example we are going to deploy two clusters on Amazon EC2 using our new role:

  • 1 node for ClusterControl
  • 3 nodes for Galera Cluster (Percona XtraDB Cluster 5.6)
  • 4 nodes for MySQL Replication (Percona Server 5.7)

The following diagram illustrates our setup with Ansible:

First, let’s decide what our infrastructure in AWS will look like:

  • Region: us-west-1
  • Availability Zone: us-west-1a
  • AMI ID: ami-d1315fb1
    • AMI name: RHEL 7.2 HVM
    • SSH user: ec2-user
  • Instance size: t2.medium
  • Keypair: mykeypair
  • VPC subnet: subnet-9ecc2dfb
  • Security group: default

Preparing our Ansible Master

We are using Ubuntu 14.04 as the Ansible master host in a local data-center to deploy our cluster on AWS EC2.

  1. If you already have Ansible installed, you may skip this step:

    $ apt-get install ansible python-setuptools
  2. Install boto (required by the ec2.py inventory script):

    $ pip install boto
  3. Download and configure ec2.py and ec2.ini under /etc/ansible. Ensure the Python script is executable:

    $ cd /etc/ansible
    $ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.py
    $ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.ini
    $ chmod 755 /etc/ansible/ec2.py
  4. Set up the Secret and Access Key environment variables. Get the AWS Secret and Access Keys from Amazon EC2 Identity and Access Management (IAM) and configure them as shown below:

    $ export AWS_ACCESS_KEY_ID='YOUR_AWS_API_KEY'
    $ export AWS_SECRET_ACCESS_KEY='YOUR_AWS_API_SECRET_KEY'
  5. Configure environment variables for ec2.py and ec2.ini.

    export ANSIBLE_HOSTS=/etc/ansible/ec2.py
    export EC2_INI_PATH=/etc/ansible/ec2.ini
  6. Configure AWS keypair. Ensure the keypair exists on the node. For example, if the keypair is located under /root/mykeypair.pem, use the following command to add it to the SSH agent:

    $ ssh-agent bash
    $ ssh-add /root/mykeypair.pem
  7. Verify if Ansible can see our cloud. If there are running EC2 instances, you should get a list of them in JSON by running this command:

    $ /etc/ansible/ec2.py --list

Note that you can put the environment variables from steps 4 and 5 inside your .bashrc or .bash_profile file to load them automatically.
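
For example, the following lines could be appended to ~/.bashrc, using the same values as above:

export AWS_ACCESS_KEY_ID='YOUR_AWS_API_KEY'
export AWS_SECRET_ACCESS_KEY='YOUR_AWS_API_SECRET_KEY'
export ANSIBLE_HOSTS=/etc/ansible/ec2.py
export EC2_INI_PATH=/etc/ansible/ec2.ini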

Define our infrastructure inside the Ansible playbook

We are going to create two playbooks. The first one defines our EC2 instances in AWS (ec2-instances.yml), and the second one deploys ClusterControl and the database clusters (deploy-everything.yml).

Here is an example content of ec2-instances.yml:

- name: Create instances
  hosts: localhost
  gather_facts: False

  tasks:
    - name: Provision ClusterControl node
      ec2:
        count: 1
        region: us-east-1
        zone: us-east-1a
        key_name: mykeypair
        group: default
        instance_type: t2.medium
        image: ami-3f03c55c
        wait: yes
        wait_timeout: 500
        volumes:
          - device_name: /dev/sda1
            device_type: standard
            volume_size: 20
            delete_on_termination: true
        monitoring: no
        vpc_subnet_id: subnet-9ecc2dfb
        assign_public_ip: yes
        instance_tags:
          Name: clustercontrol
          set: ansible
          group: clustercontrol

    - name: Provision Galera nodes
      ec2:
        count: 3
        region: us-east-1
        zone: us-east-1a
        key_name: mykeypair
        group: default
        instance_type: t2.medium
        image: ami-3f03c55c
        wait: yes
        wait_timeout: 500
        volumes:
          - device_name: /dev/sdf
            device_type: standard
            volume_size: 20
            delete_on_termination: true
        monitoring: no
        vpc_subnet_id: subnet-9ecc2dfb
        assign_public_ip: yes
        instance_tags:
          Name: galeracluster
          set: ansible
          group: galeracluster

    - name: Provision MySQL Replication nodes
      ec2:
        count: 4
        region: us-east-1
        zone: us-east-1a
        key_name: mykeypair
        group: default
        instance_type: t2.medium
        image: ami-3f03c55c
        wait: yes
        wait_timeout: 500
        volumes:
          - device_name: /dev/sdf
            device_type: standard
            volume_size: 20
            delete_on_termination: true
        monitoring: no
        vpc_subnet_id: subnet-9ecc2dfb
        assign_public_ip: yes
        instance_tags:
          Name: replication
          set: ansible
          group: replication

There are three types of instances with different instance_tags (Name, set and group) in the playbook. The “group” tag distinguishes our host groups so that they can be referenced in the deployment playbook as part of the Ansible host inventory. The “set” tag marks the instances as created by Ansible. Since we are provisioning everything from a local data-center, we set assign_public_ip to “yes” so the instances are reachable inside the VPC under “subnet-9ecc2dfb”.
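
As a sanity check, ec2.py exposes these tags as dynamic inventory groups (tag_group_*, tag_set_*), which you can list before writing the deployment playbook, for example:

$ ansible -i /etc/ansible/ec2.py tag_group_galeracluster --list-hosts
$ ansible -i /etc/ansible/ec2.py tag_set_ansible --list-hosts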

Next, we create the deployment playbook as per below (deploy-everything.yml):

- name: Configure ClusterControl instance.
  hosts: tag_group_clustercontrol
  become: true
  user: ec2-user
  gather_facts: true

  roles:
    - { role: severalnines.clustercontrol, tags: controller }

  vars:
    cc_admin:
      - email: "admin@email.com"
        password: "test123"

- name: Configure Galera Cluster and Replication instances.
  hosts:
    - tag_group_galeracluster
    - tag_group_replication
  user: ec2-user
  become: true
  gather_facts: true

  roles:
    - { role: severalnines.clustercontrol, tags: dbnodes }

  vars:
    clustercontrol_ip_address: "{{ hostvars[groups['tag_group_clustercontrol'][0]]['ec2_ip_address'] }}"

- name: Create the database clusters.
  hosts: tag_group_clustercontrol
  become: true
  user: ec2-user

  roles:
    - { role: severalnines.clustercontrol, tags: deploy-database }

  vars:
    cc_cluster:
      - deployment: true
        operation: "create"
        cluster_type: "galera"
        mysql_cnf_template: "my.cnf.galera"
        mysql_datadir: "/var/lib/mysql"
        mysql_hostnames:
          - "{{ hostvars[groups['tag_group_galeracluster'][0]]['ec2_ip_address'] }}"
          - "{{ hostvars[groups['tag_group_galeracluster'][1]]['ec2_ip_address'] }}"
          - "{{ hostvars[groups['tag_group_galeracluster'][2]]['ec2_ip_address'] }}"
        mysql_password: "password"
        mysql_port: 3306
        mysql_version: "5.6"
        ssh_keyfile: "/root/.ssh/id_rsa"
        ssh_user: "root"
        sudo_password: ""
        vendor: "percona"
      - deployment: true
        operation: "create"
        cluster_type: "replication"
        mysql_cnf_template: "my.cnf.repl57"
        mysql_datadir: "/var/lib/mysql"
        mysql_hostnames:
          - "{{ hostvars[groups['tag_group_replication'][0]]['ec2_ip_address'] }}"
          - "{{ hostvars[groups['tag_group_replication'][1]]['ec2_ip_address'] }}"
          - "{{ hostvars[groups['tag_group_replication'][2]]['ec2_ip_address'] }}"
          - "{{ hostvars[groups['tag_group_replication'][3]]['ec2_ip_address'] }}"
        mysql_password: "password"
        mysql_port: 3306
        mysql_version: "5.7"
        ssh_keyfile: "/root/.ssh/id_rsa"
        ssh_user: "root"
        sudo_password: ""
        vendor: "percona"

The Ansible user is “ec2-user” for the RHEL 7.2 image. The playbook shows the deployment flow as:

  1. Install and configure ClusterControl (tags: controller)
  2. Set up passwordless SSH from the ClusterControl node to all database nodes (tags: dbnodes). In this section, we have to define clustercontrol_ip_address so we know which ClusterControl node is used to manage our nodes.
  3. Perform the database deployment. The database cluster item definition will be passed to the ClusterControl RPC interface listening on the EC2 instance that has “tag_group_clustercontrol”. For MySQL replication, the first node in the mysql_hostnames list is the master.

The above are the simplest variables needed to get you started. For more customization options, refer to the Variables section of the role’s documentation page.

Fire them up

You need to have the Ansible role installed. Grab it from Ansible Galaxy or the Github repository:

$ ansible-galaxy install severalnines.clustercontrol

Then, create the EC2 instances:

$ ansible-playbook -i /etc/ansible/ec2.py ec2-instances.yml

Refresh the inventory:

$ /etc/ansible/ec2.py --refresh-cache

Verify all EC2 instances are reachable before the deployment begins (you should get SUCCESS for all nodes):

$ ansible -m ping "tag_set_ansible" -u ec2-user

Install ClusterControl and deploy the database cluster:

$ ansible-playbook -i /etc/ansible/ec2.py deploy-everything.yml

Wait a couple of minutes until the playbook completes. Then, log in to ClusterControl using the default email address and password defined in the playbook, and you should be inside the ClusterControl dashboard. Go to Settings -> Cluster Jobs, where you should see the “Create Cluster” jobs scheduled and the deployments in progress.

This is our final result on ClusterControl dashboard:

The total deployment time, from installing Ansible to the database deployment, was about 50 minutes. This included waiting for the instances to be created and the database clusters to be deployed. That is pretty good, considering we were spinning up 8 nodes and deploying two database clusters from scratch. How long does it take you to deploy two clusters from scratch?

Future Plan

At the moment, the Ansible role only supports deployment of the following:

  • Create new Galera Cluster
    • Percona XtraDB Cluster (5.5/5.6)
    • MariaDB Galera Cluster (5.5/10.1)
    • MySQL Galera Cluster - Codership (5.5/5.6)
  • Create new MySQL Replication
    • Percona Server (5.7/5.6)
    • MariaDB Server (10.1)
    • MySQL Server - Oracle (5.7)
  • Add existing Galera Cluster
    • Percona/MariaDB/Codership (all stable versions)

We’re in the process of adding support for other cluster types supported by ClusterControl.

We’d love to hear your feedback in the comments section below. Would you like to see integration with more cloud providers (Azure, Google Cloud Platform, Rackspace)? What about virtualization platforms like OpenStack, VMware, Vagrant and Docker? How about load balancers (HAProxy, MaxScale and ProxySQL)? And Galera arbitrator (garbd), asynchronous replication slaves to Galera clusters, and backup management right from the Ansible playbook? The list can be very long, so let us know what is important to you. Happy automation!

MySQL on Docker: Introduction to Docker Swarm Mode and Multi-Host Networking

In the previous blog post, we looked into Docker’s single-host networking for MySQL containers. This time, we are going to look into the basics of multi-host networking and Docker swarm mode, a built-in orchestration tool to manage containers across multiple hosts.

Docker Engine - Swarm Mode

Running MySQL containers on multiple hosts can get a bit more complex depending on the clustering technology you choose.

Before we try to run MySQL containers with multi-host networking, we have to understand how the image works, how many resources to allocate (disk, memory, CPU), how the networking is set up (the overlay network drivers - default, flannel, weave, etc.) and how fault tolerance works (how the container is relocated, failed over and load balanced), because all of these impact the overall operations, uptime and performance of the database. It is recommended to use an orchestration tool to get more manageability and scalability on top of your Docker engine cluster. The latest Docker Engine (version 1.12, released on July 14th, 2016) includes swarm mode for natively managing a cluster of Docker Engines, called a Swarm. Take note that Docker Engine Swarm mode and Docker Swarm are two different projects, with different installation steps, even though they work in a similar way.

Some of the noteworthy parts that you should know before entering the swarm world:

  • The following ports must be opened:
    • 2377 (TCP) - Cluster management
    • 7946 (TCP and UDP) - Nodes communication
    • 4789 (TCP and UDP) - Overlay network traffic
  • There are 2 types of nodes:
    • Manager - Manager nodes perform the orchestration and cluster management functions required to maintain the desired state of the swarm. Manager nodes elect a single leader to conduct orchestration tasks.
    • Worker - Worker nodes receive and execute tasks dispatched from manager nodes. By default, manager nodes are also worker nodes, but you can configure managers to be manager-only nodes.

More details in the Docker Engine Swarm documentation.
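
For example, on a CentOS/RHEL host running firewalld, the ports listed above could be opened with something like this (a sketch - adapt it to your firewall setup):

$ firewall-cmd --permanent --add-port=2377/tcp
$ firewall-cmd --permanent --add-port=7946/tcp --add-port=7946/udp
$ firewall-cmd --permanent --add-port=4789/tcp --add-port=4789/udp
$ firewall-cmd --reload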

In this blog, we are going to deploy application containers on top of a load-balanced Galera Cluster on 3 Docker hosts (docker1, docker2 and docker3), connected through an overlay network as a proof of concept for MySQL clustering in multiple Docker hosts environment. We will use Docker Engine Swarm mode as the orchestration tool.

“Swarming” Up

Let’s cluster our Docker nodes into a Swarm. Swarm mode requires an odd number of managers (obviously more than one) to maintain quorum for fault tolerance. So, we are going to use all the physical hosts as manager nodes. Note that by default, manager nodes are also worker nodes.

  1. Firstly, initialize Swarm mode on docker1. This will make the node as manager and leader:

    [root@docker1]$ docker swarm init --advertise-addr 192.168.55.111
    Swarm initialized: current node (6r22rd71wi59ejaeh7gmq3rge) is now a manager.
    
    To add a worker to this swarm, run the following command:
    
        docker swarm join \
        --token SWMTKN-1-16kit6dksvrqilgptjg5pvu0tvo5qfs8uczjq458lf9mul41hc-dzvgu0h3qngfgihz4fv0855bo \
        192.168.55.111:2377
    
    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
  2. We are going to add two more nodes as manager. Generate the join command for other nodes to register as manager:

    [docker1]$ docker swarm join-token manager
    To add a manager to this swarm, run the following command:
    
        docker swarm join \
        --token SWMTKN-1-16kit6dksvrqilgptjg5pvu0tvo5qfs8uczjq458lf9mul41hc-7fd1an5iucy4poa4g1bnav0pt \
        192.168.55.111:2377
  3. On docker2 and docker3, run the following command to register the node:

    $ docker swarm join \
        --token SWMTKN-1-16kit6dksvrqilgptjg5pvu0tvo5qfs8uczjq458lf9mul41hc-7fd1an5iucy4poa4g1bnav0pt \
        192.168.55.111:2377
  4. Verify if all nodes are added correctly:

    [docker1]$ docker node ls
    ID                           HOSTNAME       STATUS  AVAILABILITY  MANAGER STATUS
    5w9kycb046p9aj6yk8l365esh    docker3.local  Ready   Active        Reachable
    6r22rd71wi59ejaeh7gmq3rge *  docker1.local  Ready   Active        Leader
    awlh9cduvbdo58znra7uyuq1n    docker2.local  Ready   Active        Reachable

    At the moment, we have docker1.local as the leader. 

Overlay Network

The only way to let containers running on different hosts connect to each other is by using an overlay network. It can be thought of as a container network built on top of another network (in this case, the physical hosts’ network). Docker Swarm mode comes with a default overlay network which implements a VxLAN-based solution with the help of libnetwork and libkv. You can however choose another overlay network driver like Flannel, Calico or Weave, where extra installation steps are necessary. We are going to cover those in an upcoming blog post.

In Docker Engine Swarm mode, you can create an overlay network only from a manager node and it doesn’t need an external key-value store like etcd, consul or Zookeeper.

The swarm makes the overlay network available only to nodes in the swarm that require it for a service. When you create a service that uses an overlay network, the manager node automatically extends the overlay network to nodes that run service tasks.

Let’s create an overlay network for our containers. We are going to deploy Percona XtraDB Cluster and application containers on separate Docker hosts to achieve fault tolerance. These containers must be running on the same overlay network so they can communicate with each other.

We are going to name our network “mynet”. You can only create this on the manager node:

[docker1]$ docker network create --driver overlay mynet

Let’s see what networks we have now:

[docker1]$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
213ec94de6c9        bridge              bridge              local
bac2a639e835        docker_gwbridge     bridge              local
5b3ba00f72c7        host                host                local
03wvlqw41e9g        ingress             overlay             swarm
9iy6k0gqs35b        mynet               overlay             swarm
12835e9e75b9        none                null                local

There are now 2 overlay networks with a Swarm scope. The “mynet” network is what we are going to use today when deploying our containers. The ingress overlay network comes by default. The swarm manager uses ingress load balancing to expose the services you want externally to the swarm.

Deployment using Services and Tasks

We are going to deploy the Galera Cluster containers through services and tasks. When you create a service, you specify which container image to use and which commands to execute inside running containers. There are two types of services:

  • Replicated services - Distribute a specified number of replica tasks among the nodes, based on the scale you set in the desired state, for example “--replicas 3”.
  • Global services - Run one task for the service on every available node in the cluster, for example “--mode global”. If you have 7 Docker nodes in the Swarm, there will be one container on each of them.

Docker Swarm mode has a limitation in managing persistent data storage. When a node fails, the manager gets rid of the containers and creates new ones in their place to meet the desired replica state. Since a container is discarded when it goes down, we would lose the corresponding data volume as well. Fortunately for Galera Cluster, the MySQL container can be automatically provisioned with state/data when joining.

Deploying Key-Value Store

The Docker image that we are going to use is from Percona-Lab. The image requires the MySQL containers to access a key-value store (only etcd is supported) for IP address discovery during cluster initialization and bootstrap. The containers will look for other IP addresses in etcd and, if any are found, start MySQL with the proper wsrep_cluster_address. Otherwise, the first container will start with the bootstrap address, gcomm://.
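
For illustration, once the cluster is up, the discovery entries the containers have registered can be inspected from the Docker host running the etcd container (a hypothetical check - it assumes etcdctl is available in the container and the /galera/<cluster_name>/<ip> key layout used by this image, with CLUSTER_NAME=galera as below):

$ docker exec -it $(docker ps | grep etcd | awk {'print $1'}) etcdctl ls /galera/galera --recursive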

  1. Let’s deploy our etcd service. We will use the etcd image available here. It requires us to have a discovery URL reflecting the number of etcd nodes we are going to deploy. In this case, we are going to set up a standalone etcd container, so the command is:

    [docker1]$ curl -w "\n" 'https://discovery.etcd.io/new?size=1'
    https://discovery.etcd.io/a293d6cc552a66e68f4b5e52ef163d68
  2. Then, use the generated URL as “-discovery” value when creating the service for etcd:

    [docker1]$ docker service create \
    --name etcd \
    --replicas 1 \
    --network mynet \
    -p 2379:2379 \
    -p 2380:2380 \
    -p 4001:4001 \
    -p 7001:7001 \
    elcolio/etcd:latest \
    -name etcd \
    -discovery=https://discovery.etcd.io/a293d6cc552a66e68f4b5e52ef163d68

    At this point, Docker swarm mode will orchestrate the deployment of the container on one of the Docker hosts.

  3. Retrieve the etcd service virtual IP address. We are going to use that in the next step when deploying the cluster:

    [docker1]$ docker service inspect etcd -f "{{ .Endpoint.VirtualIPs }}"
    [{03wvlqw41e9go8li34z2u1t4p 10.255.0.5/16} {9iy6k0gqs35bn541pr31mly59 10.0.0.2/24}]

    At this point, our architecture looks like this:

Deploying Database Cluster

  1. Specify the virtual IP address for etcd in the following command to deploy Galera (Percona XtraDB Cluster) containers:

    [docker1]$ docker service create \
    --name mysql-galera \
    --replicas 3 \
    -p 3306:3306 \
    --network mynet \
    --env MYSQL_ROOT_PASSWORD=mypassword \
    --env DISCOVERY_SERVICE=10.0.0.2:2379 \
    --env XTRABACKUP_PASSWORD=mypassword \
    --env CLUSTER_NAME=galera \
    perconalab/percona-xtradb-cluster:5.6
  2. The deployment takes some time, as the image has to be downloaded on the assigned worker/manager nodes. You can verify the status with the following command:

    [docker1]$ docker service ps mysql-galera
    ID                         NAME                IMAGE                                  NODE           DESIRED STATE  CURRENT STATE            ERROR
    8wbyzwr2x5buxrhslvrlp2uy7  mysql-galera.1      perconalab/percona-xtradb-cluster:5.6  docker1.local  Running        Running 3 minutes ago
    0xhddwx5jzgw8fxrpj2lhcqeq  mysql-galera.2      perconalab/percona-xtradb-cluster:5.6  docker3.local  Running        Running 2 minutes ago
    f2ma6enkb8xi26f9mo06oj2fh  mysql-galera.3      perconalab/percona-xtradb-cluster:5.6  docker2.local  Running        Running 2 minutes ago
  3. We can see that the mysql-galera service is now running. Let’s list out all services we have now:

    [docker1]$ docker service ls
    ID            NAME          REPLICAS  IMAGE                                  COMMAND
    1m9ygovv9zui  mysql-galera  3/3       perconalab/percona-xtradb-cluster:5.6
    au1w5qkez9d4  etcd          1/1       elcolio/etcd:latest                    -name etcd -discovery=https://discovery.etcd.io/a293d6cc552a66e68f4b5e52ef163d68
  4. Swarm mode has an internal DNS component that automatically assigns each service in the swarm a DNS entry, so you can use the service name to resolve the virtual IP address:

    [docker2]$ docker exec -it $(docker ps | grep etcd | awk {'print $1'}) ping mysql-galera
    PING mysql-galera (10.0.0.4): 56 data bytes
    64 bytes from 10.0.0.4: seq=0 ttl=64 time=0.078 ms
    64 bytes from 10.0.0.4: seq=1 ttl=64 time=0.179 ms

    Or, retrieve the virtual IP address through the “docker service inspect” command:

    [docker1]# docker service inspect mysql-galera -f "{{ .Endpoint.VirtualIPs }}"
    [{03wvlqw41e9go8li34z2u1t4p 10.255.0.7/16} {9iy6k0gqs35bn541pr31mly59 10.0.0.4/24}]

    Our architecture now can be illustrated as below:

Deploying Applications

Finally, you can create the application service and pass the MySQL service name (mysql-galera) as the database host value:

[docker1]$ docker service create \
--name wordpress \
--replicas 2 \
-p 80:80 \
--network mynet \
--env WORDPRESS_DB_HOST=mysql-galera \
--env WORDPRESS_DB_USER=root \
--env WORDPRESS_DB_PASSWORD=mypassword \
wordpress

Once deployed, we can then retrieve the virtual IP address for wordpress service through the “docker service inspect” command:

[docker1]# docker service inspect wordpress -f "{{ .Endpoint.VirtualIPs }}"
[{p3wvtyw12e9ro8jz34t9u1t4w 10.255.0.11/16} {kpv8e0fqs95by541pr31jly48 10.0.0.8/24}]

At this point, this is what we have:

Our distributed application and database setup is now deployed by Docker containers.

Connecting to the Services and Load Balancing

At this point, the following ports are published (based on the -p flag on each “docker service create” command) on all Docker nodes in the cluster, whether or not the node is currently running the task for the service:

  • etcd - 2380, 2379, 7001, 4001
  • MySQL - 3306
  • HTTP - 80

If we connect directly to the PublishedPort, with a simple loop, we can see that the MySQL service is load balanced among containers:

[docker1]$ while true; do mysql -uroot -pmypassword -h127.0.0.1 -P3306 -NBe 'select @@wsrep_node_address'; sleep 1; done
10.255.0.10
10.255.0.8
10.255.0.9
10.255.0.10
10.255.0.8
10.255.0.9
10.255.0.10
10.255.0.8
10.255.0.9
10.255.0.10
^C

At the moment, the Swarm manager handles load balancing internally and there is no way to configure the load balancing algorithm. We can use external load balancers to route outside traffic to these Docker nodes. If any of the Docker nodes goes down, the service will be relocated to other available nodes.

Warning: The setup shown on this page is just a proof of concept. It may be incomplete for production use, and it does not eliminate every single point of failure (SPOF).

That’s all for now. In the next blog post, we’ll take a deeper look at Docker overlay network drivers for MySQL containers.

Load balanced MySQL Galera setup - Manual Deployment vs ClusterControl

If you have deployed databases with high availability before, you will know that a deployment does not always go your way, even though you’ve done it a zillion times. You could spend a full day setting everything up and may still end up with a non-functioning cluster. It is not uncommon to start over, as it’s really hard to figure out what went wrong.

So, deploying a MySQL Galera Cluster with redundant load balancing takes a bit of time. This blog looks at how long it would take to do it manually, versus using ClusterControl to perform the task. For those who have not used it before, ClusterControl is agentless management and automation software for databases. It supports MySQL (Oracle and Percona Server), MariaDB, MongoDB (MongoDB Inc. and Percona), and PostgreSQL.

For manual deployment, we’ll be using the popular “Google university” to search for how-to’s and blogs that provide deployment steps.

Database Deployment

Deployment of a database consists of several parts. These include getting the hardware ready, software installation, configuration tweaking and a bit of tuning and testing. Now, let’s assume the hardware is ready, the OS is installed and it is up to you to do the rest. We are going to deploy a three-node Galera cluster as shown in the following diagram:

Manual

Googling “install mysql galera cluster” led us to this page. By following the steps explained there, plus adding some dependencies, the following is what we ran on every DB node:

$ semanage permissive -a mysqld_t
$ systemctl stop firewalld
$ systemctl disable firewalld
$ vim /etc/yum.repos.d/galera.repo # setting up Galera repository
$ yum install http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
$ yum install mysql-wsrep-5.6 galera3 percona-xtrabackup
$ vim /etc/my.cnf # setting up wsrep_* variables
$ service mysql start --wsrep-new-cluster # bootstrap the first node; ‘service mysql start’ on the remaining nodes
$ mysql_secure_installation
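
The “setting up wsrep_* variables” step is where most of the manual work happens. A minimal sketch of the additions to /etc/my.cnf (values assumed - adjust the IPs, cluster name and SST credentials to your environment):

[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera-3/libgalera_smm.so
wsrep_cluster_name=my_wsrep_cluster
wsrep_cluster_address=gcomm://10.0.0.217,10.0.0.218,10.0.0.219
wsrep_node_address=10.0.0.217
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:sstpassword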

The above commands took around 18 minutes to finish on each DB node. Total deployment time was 54 minutes.

ClusterControl

Using ClusterControl, here are the steps we took to first install ClusterControl (5 minutes):

$ wget http://severalnines.com/downloads/cmon/install-cc
$ chmod 755 install-cc
$ ./install-cc

Login to the ClusterControl UI and create the default admin user.

Setup passwordless SSH to all DB nodes on ClusterControl node (1 minute):

$ ssh-keygen -t rsa
$ ssh-copy-id 10.0.0.217
$ ssh-copy-id 10.0.0.218
$ ssh-copy-id 10.0.0.219

In the ClusterControl UI, go to Create Database Cluster -> MySQL Galera and enter the following details (4 minutes):

Click Deploy and wait until the deployment finishes. You can monitor the deployment progress under ClusterControl -> Settings -> Cluster Jobs and once deployed, you will notice it took around 15 minutes:

To sum it up, the total deployment time including installing ClusterControl is 15 + 4 + 1 + 5 = 25 minutes.

The following table summarizes the above deployment actions:

Area          Manual                         ClusterControl
Total steps   8 steps x 3 servers + 1 = 25   8
Duration      18 x 3 = 54 minutes            25 minutes

To summarize, we needed fewer steps and less time with ClusterControl to achieve the same result. Three nodes is more or less the minimum cluster size, and the difference would only grow with clusters of more nodes.

Load Balancer and Virtual IP Deployment

Now that we have our Galera cluster running, the next thing is to add a load balancer in front. This provides one single endpoint to the cluster, thus reducing the complexity for applications to connect to a multi-node system. Applications would not need to have knowledge of the topology and any changes caused by failures or admin maintenance would be masked. For fault tolerance, we would need at least 2 load balancers with a virtual IP address.

By adding a load balancer tier, our architecture will look something like this:

Manual Deployment

Googling “install haproxy virtual ip galera cluster” led us to this page. We followed the steps:

On each HAProxy node (2 times):

$ yum install epel-release
$ yum install haproxy keepalived
$ systemctl enable haproxy
$ systemctl enable keepalived
$ vi /etc/haproxy/haproxy.cfg # configure haproxy
$ systemctl start haproxy
$ vi /etc/keepalived/keepalived.conf # configure keepalived
$ systemctl start keepalived
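
The two “configure” steps above carry most of the complexity. For illustration, a minimal haproxy.cfg listener and keepalived.conf instance might look like the sketches below (IPs assumed: the DB nodes 10.0.0.217-219 from our earlier deployment, and a hypothetical virtual IP 10.0.0.100):

# /etc/haproxy/haproxy.cfg (fragment)
listen mysql_galera
    bind *:3306
    mode tcp
    balance leastconn
    option httpchk
    server db1 10.0.0.217:3306 check port 9200
    server db2 10.0.0.218:3306 check port 9200
    server db3 10.0.0.219:3306 check port 9200

# /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state MASTER                 # BACKUP on the second HAProxy node
    interface eth0
    virtual_router_id 51
    priority 101                 # use a lower priority on the BACKUP node
    virtual_ipaddress {
        10.0.0.100
    }
}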

On each DB node (3 times):

$ wget https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck
$ chmod +x clustercheck
$ mv clustercheck /usr/bin/
$ vi /etc/xinetd.d/mysqlchk # configure mysql check user
$ vi /etc/services # setup xinetd port
$ systemctl start xinetd
$ mysql -uroot -p
mysql> GRANT PROCESS ON *.* TO 'clustercheckuser'@'localhost' IDENTIFIED BY 'clustercheckpassword!';
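
The mysqlchk service configured above is the standard percona-clustercheck xinetd wrapper listening on port 9200; for illustration, sketches of the two edited files, based on the upstream clustercheck defaults:

# /etc/xinetd.d/mysqlchk
service mysqlchk
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    per_source      = UNLIMITED
}

# /etc/services (appended line)
mysqlchk        9200/tcp        # mysqlchk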

The total deployment time for this was around 42 minutes.

ClusterControl

For the ClusterControl host, here are the steps taken (1 minute) :

$ ssh-copy-id 10.0.0.229
$ ssh-copy-id 10.0.0.230

Then, go to ClusterControl -> select the database cluster -> Add Load Balancer and enter the IP addresses of the HAProxy hosts, one at a time:

Once both HAProxy instances are deployed, we can add Keepalived to provide a floating IP address and perform failover:

Go to ClusterControl -> select the database cluster -> Logs -> Jobs. The total deployment took about 5 minutes, as shown in the screenshot below:

Thus, the total deployment time for the load balancers plus a virtual IP address and redundancy is 1 + 5 = 6 minutes.

The following table summarizes the above deployment actions:

Area          Manual                                          ClusterControl
Total steps   (8 x 2 HAProxy nodes) + (8 x 3 DB nodes) = 40   6
Duration      42 minutes                                      6 minutes

ClusterControl also manages and monitors the load balancers:

Adding a Read Replica

Our setup is now looking pretty decent, and the next step is to add a read replica to Galera. What is a read replica, and why do we need it? A read replica is an asynchronous slave, replicating from one of the Galera nodes using standard MySQL replication. There are a few good reasons to have this. Long-running reporting/OLAP type queries on a Galera node might slow down an entire cluster, if the reporting load is so intensive that the node has to spend considerable effort coping with it. So reporting queries can be sent to a standalone server, effectively isolating Galera from the reporting load. An asynchronous slave can also serve as a remote live backup of our cluster in a DR site, especially if the link is not good enough to stretch one cluster across 2 sites.

Our architecture is now looking like this:

Manual Deployment

Googling “mysql galera with slave” brought us to this page. We followed the steps:

On master node:

$ vim /etc/my.cnf # setting up binary log and gtid
$ systemctl restart mysql
$ mysqldump --single-transaction --skip-add-locks --triggers --routines --events > dump.sql
$ mysql -uroot -p
mysql> GRANT REPLICATION SLAVE ON .. ;
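
The “binary log and gtid” additions to the master’s my.cnf boil down to a few lines; a sketch with assumed values:

[mysqld]
server_id=1
log_bin=binlog
log_slave_updates=1
gtid_mode=ON
enforce_gtid_consistency=ON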

On slave node (we used Percona Server):

$ yum install http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
$ yum install Percona-Server-server-56
$ vim /etc/my.cnf # setting up server id, gtid and stuff
$ systemctl start mysql
$ mysql_secure_installation
$ scp root@master:~/dump.sql /root
$ mysql -uroot -p < /root/dump.sql
$ mysql -uroot -p
mysql> CHANGE MASTER ... MASTER_AUTO_POSITION=1;
mysql> START SLAVE;

The total time spent on this manual deployment was around 40 minutes (with a 1GB database).

ClusterControl

With ClusterControl, here is what we did. First, configure passwordless SSH to the target slave (0.5 minutes):

$ ssh-copy-id 10.0.0.231 # setup passwordless ssh

Then, on one of the MySQL Galera nodes, we have to enable binary logging so it can become a master (2 minutes):

Click Proceed to start enabling binary log for this node. Once completed, we can add the replication slave by going to ClusterControl -> choose the Galera cluster -> Add Replication Slave and specify as per below (6 minutes including streaming 1GB of database to slave):

Click “Add node” and you are set. The total deployment time for adding a read replica, complete with data, is 6 + 2 + 0.5 = 8.5 minutes.

The following table summarizes the above deployment actions:

Area          Manual       ClusterControl
Total steps   15           3
Duration      40 minutes   8.5 minutes

We can see that ClusterControl automates a number of time-consuming tasks, including slave installation, backup streaming and slaving off the master. Note that ClusterControl will also handle things like master failover, so that replication does not break if the Galera master fails.

Conclusion

A good deployment is important, as it is the foundation of an upcoming database workload. Speed matters too, especially in agile environments where a team frequently deploys entire systems and tears them down after a short time. You’re welcome to try ClusterControl to automate your database deployments, it comes with a free 30-day trial of the full enterprise features. Once the trial ends, it will default to the community edition (free forever).

High Availability on a Shoestring Budget - Deploying a Minimal Two Node MySQL Galera Cluster

We regularly get questions about how to set up a Galera cluster with just 2 nodes. The documentation clearly states you should have at least 3 Galera nodes to avoid network partitioning. But there are some valid reasons for considering a 2 node deployment, e.g., if you want to achieve database high availability but have a limited budget to spend on a third database node. Or perhaps you are running Galera in a development/sandbox environment and prefer a minimal setup.

Galera implements a quorum-based algorithm to select a primary component through which it enforces consistency. The primary component needs to have a majority of votes, so in a 2 node system there would be no majority, resulting in split brain. Fortunately, it is possible to add garbd (Galera Arbitrator Daemon), a lightweight stateless daemon that can act as the odd node. Arbitrator failure does not affect cluster operations, and a new instance can be reattached to the cluster at any time. There can be several arbitrators in the cluster.
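
For reference, garbd itself is a one-liner to run manually on a third host (a sketch - substitute your own node addresses and cluster name):

$ garbd --address "gcomm://192.168.1.101:4567,192.168.1.102:4567" --group my_wsrep_cluster --daemon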

ClusterControl has support for deploying garbd on non-database hosts.

Normally a Galera cluster needs at least three hosts to be fully functional; however, at deploy time, two nodes suffice to create a primary component. Here are the steps:

  1. Deploy a Galera cluster of two nodes,
  2. After the cluster has been deployed by ClusterControl, add garbd on the ClusterControl node.

You should end up with the below setup:

Deploy the Galera Cluster

Go to the ClusterControl deploy wizard to deploy the cluster.

Even though ClusterControl warns you that a Galera cluster needs an odd number of nodes, add only two nodes to the cluster.

Deploying a Galera cluster will trigger a ClusterControl job which can be monitored at the Jobs page.

Install Garbd

Once deployment is complete, install garbd on the ClusterControl host. You will find it under Manage -> Load Balancer:

Installing garbd will trigger a ClusterControl job which can be monitored at the Jobs page. Once completed, you can verify garbd is running with a green tick icon at the top bar:

That’s it. Our minimal two-node Galera cluster is now ready!

Deploying and Monitoring MySQL and MongoDB clusters in the cloud with NinesControl

NinesControl is a new service from Severalnines which helps you deploy MySQL Galera and MongoDB clusters in the cloud. In this blog post we will show you how you can easily deploy and monitor your databases on AWS and DigitalOcean.

Deployment

At the time of writing, NinesControl supports two cloud providers - Amazon Web Services and DigitalOcean. Before you attempt to deploy, you first need to configure access credentials to the cloud you’d like to run on. We covered this topic in a previous blog post.

Once that’s done, you should see the credentials defined for the chosen cloud provider in the “Cloud Accounts” tab.

You’ll see the screen below, as you do not have any clusters running yet:

You can click on “Deploy your first cluster” to start your first deployment. You will be presented with a screen like the one below - you can pick the cluster type you’d like to deploy and set some configuration settings like port, data directory and password. You can also set the number of nodes in the cluster and which database vendor you’d like to use.

For MongoDB, the deployment screen is fairly similar, with some additional settings to configure.

Once you are done here, it’s time to move to the second step - picking the credentials to use to deploy your cluster. You can pick either DigitalOcean or Amazon Web Services, and any of the credentials you have added to NinesControl. In our example, we have just a single credential, but it’s perfectly ok to have more than one credential per cloud provider.

Once you’ve made your choice, proceed to the third and final step, in which you will pick what kind of VMs you’d like to use. This screen differs between AWS and DigitalOcean.

If you picked AWS, you will have the option to choose the operating system and VM size. You also need to pick the VPC in which you will deploy, and the subnet which will be used by your cluster. If you don’t see anything in the drop-down lists, you can click on the “[Add]” buttons and NinesControl will create the VPC and subnet for you. Finally, you need to set the volume size of the VMs. After that, you can trigger the deployment.

DigitalOcean uses a slightly different screen, but the idea is similar - you need to pick a region, an operating system and a droplet size.

Once you are done, click on “Deploy cluster” to start deployment.

The status of the deployment is shown in the cluster list. You can also click on the status bar to see the full deployment log. Whenever you’d like to deploy a new cluster, click on the “Deploy cluster” button.

Monitoring

Once deployment completes, you’ll see a list of your clusters.

When you click on one of them, you’ll see a list of nodes in the cluster and cluster-wide metrics.

Of course, the metrics are cluster-dependent. The above is what you will see for a MySQL/MariaDB Galera cluster. MongoDB will present you with different graphs and metrics:

When you click on a node, you will be redirected to host statistics of that particular node - CPU, network, disk, RAM usage - all of those very important basics which tell you about node health:

As you can see, NinesControl not only allows you to deploy Galera and MongoDB clusters in a fast and efficient way, it also collects important metrics for you and shows them as graphs.

Give it a try and let us know what you think.

MySQL on Docker: Deploy a Homogeneous Galera Cluster with etcd

In the previous blog post, we looked into the multi-host networking capabilities of Docker, with the native overlay network and Calico. In this blog post, our journey to make Galera Cluster run smoothly on Docker containers continues. Deploying Galera Cluster on Docker is tricky when using orchestration tools. Due to the nature of the scheduler in container orchestration tools and the assumption of homogeneous images, the scheduler will just fire up the respective containers according to the run command and leave the bootstrapping process to the container’s entrypoint logic when starting up. And you do not want to do that for Galera - starting all nodes at once means each node will form a “1-node cluster”, and you’ll end up with a disjointed system.

“Homogeneousing” Galera Cluster

That might be a new word, but it holds true for stateful services like MySQL Replication and Galera Cluster. As one might know, the bootstrapping process for Galera Cluster usually requires manual intervention, where you have to decide which node is the most advanced one to bootstrap from. There is nothing wrong with this step - you need to be aware of the state of each database node before deciding on the sequence in which to start them up. Galera Cluster is a distributed system, and its redundancy model works like that.

However, container orchestration tools like Docker Engine Swarm Mode and Kubernetes are not aware of Galera’s redundancy model. The orchestration tool presumes containers are independent of each other. If they are dependent, you have to have an external service that monitors their state. The best way to achieve this is to use a key/value store as a reference point for the other containers when starting up.

This is where a service discovery tool like etcd comes into the picture. The basic idea is that each node should report its state to the service periodically. This simplifies the decision process when starting up. For Galera Cluster, the node that has wsrep_local_state_comment equal to Synced shall be used as a reference node when constructing the Galera communication address (gcomm) during joining. Otherwise, the most updated node has to be bootstrapped first.

Etcd has a very nice feature called TTL, where you can expire a key after a certain amount of time. This is useful for determining the state of a node, as a key/value entry only exists if an alive node keeps reporting it. As a result, the nodes won’t have to connect to each other to determine state (which is very troublesome in a dynamic environment) when forming a cluster. For example, consider the following keys:

    {"createdIndex": 10074,"expiration": "2016-11-29T10:55:35.218496083Z","key": "/galera/my_wsrep_cluster/10.255.0.7/wsrep_last_committed","modifiedIndex": 10074,"ttl": 10,"value": "2881"
    },
    {"createdIndex": 10072,"expiration": "2016-11-29T10:55:34.650574629Z","key": "/galera/my_wsrep_cluster/10.255.0.7/wsrep_local_state_comment","modifiedIndex": 10072,"ttl": 10,"value": "Synced"
    }

After 10 seconds (the ttl value), those keys will be removed from etcd. Basically, all nodes should report to etcd periodically with expiring keys. A container should report every N seconds while it is alive (wsrep_local_state_comment=Synced and wsrep_last_committed=#value) via a background process. If a container goes down, it will no longer send updates to etcd, so the keys are removed after expiration. This simply indicates that the node was registered but is no longer synced with the cluster, and it will be skipped when constructing the Galera communication address at a later point.
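
Conceptually, the background process inside each container refreshes its keys with something like the following (etcd v2 syntax; key names taken from the example above):

$ etcdctl set --ttl 10 /galera/my_wsrep_cluster/10.255.0.7/wsrep_local_state_comment "Synced"
$ etcdctl set --ttl 10 /galera/my_wsrep_cluster/10.255.0.7/wsrep_last_committed "2881"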

The overall flow of the joining procedure is illustrated in the following flow chart:

We have built a Docker image that follows the above approach. It is specifically built for running Galera Cluster using Docker’s orchestration tools, and is available on Docker Hub and in our Github repository. It requires an etcd cluster as the discovery service (multiple etcd hosts are supported) and is based on Percona XtraDB Cluster 5.6. The image includes Percona Xtrabackup, jq (a JSON processor) and a shell script tailored for Galera health checks, called report_status.sh.

You are welcome to fork or contribute to the project. Any bugs can be reported via Github or via our support page.

Deploying etcd Cluster

etcd is a distributed key value store that provides a simple and efficient way to store data across a cluster of machines. It’s open-source and available on GitHub. It provides shared configuration and service discovery. A simple use-case is to store database connection details or feature flags in etcd as key value pairs. It gracefully handles leader elections during network partitions and will tolerate machine failures, including the leader.
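
For instance, storing and reading a connection detail as a key-value pair takes one command each (etcd v2 syntax):

$ etcdctl set /config/mysql/host 192.168.55.111
192.168.55.111
$ etcdctl get /config/mysql/host
192.168.55.111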

Since etcd is the brain of the setup, we are going to deploy it as a cluster daemon, on three nodes, instead of using containers. In this example, we are going to install etcd on each of the Docker hosts and form a three-node etcd cluster for better availability.

We used CentOS 7 as the operating system, with Docker v1.12.3, build 6b644ec. The deployment steps in this blog post are basically similar to the one used in our previous blog post.

  1. Install etcd packages:

    $ yum install etcd
  2. Modify the configuration file accordingly depending on the Docker hosts:

    $ vim /etc/etcd/etcd.conf

    For docker1 with IP address 192.168.55.111:

    ETCD_NAME=etcd1
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.111:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"

    For docker2 with IP address 192.168.55.112:

    ETCD_NAME=etcd2
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.112:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"

    For docker3 with IP address 192.168.55.113:

    ETCD_NAME=etcd3
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER="etcd1=http://192.168.55.111:2380,etcd2=http://192.168.55.112:2380,etcd3=http://192.168.55.113:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
    ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"
  3. Start the service on docker1, followed by docker2 and docker3:

    $ systemctl enable etcd
    $ systemctl start etcd
  4. Verify our cluster status using etcdctl:

    [docker3]$ etcdctl cluster-health
    member 2f8ec0a21c11c189 is healthy: got healthy result from http://0.0.0.0:2379
    member 589a7883a7ee56ec is healthy: got healthy result from http://0.0.0.0:2379
    member fcacfa3f23575abe is healthy: got healthy result from http://0.0.0.0:2379
    cluster is healthy

That’s it. Our etcd is now running as a cluster on three nodes. The diagram below illustrates our architecture:

Deploying Galera Cluster

A minimum of 3 containers is recommended for a high availability setup, so we are going to create 3 replicas to start with; the service can be scaled up and down afterwards. Running standalone is also possible with the standard "docker run" command, as shown further down.

Before we start, it’s a good idea to remove any keys related to our cluster name from etcd:

$ etcdctl rm /galera/my_wsrep_cluster --recursive
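
The service definitions below also assume an overlay network called “galera-net”, analogous to “mynet” in our earlier Swarm blog post. If it does not exist yet, create it from a manager node first:

$ docker network create --driver overlay galera-net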

Ephemeral Storage

This is the recommended way if you plan on scaling the cluster out to more nodes (or scaling back in by removing nodes). To create a three-node Galera Cluster with ephemeral storage (the MySQL datadir will be lost if the container is removed), you can use the following command:

$ docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network galera-net \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

Persistent Storage

To create a three-node Galera Cluster with persistent storage (MySQL datadir persists if the container is removed), add the mount option with type=volume:

$ docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network galera-net \
--mount type=volume,source=galera-vol,destination=/var/lib/mysql \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

Custom my.cnf

If you would like to include a customized MySQL configuration file, create a directory on the physical host beforehand:

$ mkdir /mnt/docker/mysql-config # repeat on all Docker hosts

Then, use the mount option with “type=bind” to map the path into the container. In the following example, the custom my.cnf is located at /mnt/docker/mysql-config/my-custom.cnf on each Docker host:

$ docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network galera-net \
--mount type=volume,source=galera-vol,destination=/var/lib/mysql \
--mount type=bind,src=/mnt/docker/mysql-config,dst=/etc/my.cnf.d \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

Wait for a couple of minutes and verify the service is running (CURRENT STATE = Running):

$ docker service ps mysql-galera
ID                         NAME            IMAGE               NODE           DESIRED STATE  CURRENT STATE           ERROR
2vw40cavru9w4crr4d2fg83j4  mysql-galera.1  severalnines/pxc56  docker1.local  Running        Running 5 minutes ago
1cw6jeyb966326xu68lsjqoe1  mysql-galera.2  severalnines/pxc56  docker3.local  Running        Running 12 seconds ago
753x1edjlspqxmte96f7pzxs1  mysql-galera.3  severalnines/pxc56  docker2.local  Running        Running 5 seconds ago

External applications/clients can connect to any Docker host IP address or hostname on port 3306; requests will be load balanced between the Galera containers. The connection gets NATed to a virtual IP address for each service "task" (container, in this case) using the Linux kernel’s built-in load balancing functionality, IPVS. If the application containers reside on the same overlay network (galera-net), use the assigned virtual IP address instead. You can retrieve it using the inspect option:

$ docker service inspect mysql-galera -f "{{ .Endpoint.VirtualIPs }}"
[{89n5idmdcswqqha7wcswbn6pw 10.255.0.2/16} {1ufbr56pyhhbkbgtgsfy9xkww 10.0.0.2/24}]

Our architecture is now looking like this:

As a side note, you can also run Galera in standalone mode. This is useful for testing purposes like backup and restore, measuring the impact of queries, and so on. To run it just like a standalone MySQL container, use the standard docker run command:

$ docker run -d \
-p 3306 \
--name=galera-single \
-e MYSQL_ROOT_PASSWORD=mypassword \
-e DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
-e CLUSTER_NAME=my_wsrep_cluster \
-e XTRABACKUP_PASSWORD=mypassword \
severalnines/pxc56

Scaling the Cluster

There are two ways you can do scaling:

  1. Use the “docker service scale” command.
  2. Create a new service with the same CLUSTER_NAME using the “docker service create” command.

Docker’s “scale” Command

The scale command enables you to scale one or more services up or down to the desired number of replicas. The command returns immediately, but the actual scaling of the service may take some time. Galera needs to run with an odd number of nodes to avoid network partitioning.

So a good number to scale to would be 5, then 7, and so on:

$ docker service scale mysql-galera=5

Wait for a couple of minutes to let the new containers reach the desired state. Then, verify the running service:

$ docker service ls
ID            NAME          REPLICAS  IMAGE               COMMAND
bwvwjg248i9u  mysql-galera  5/5       severalnines/pxc56

One drawback of using this method is that you have to use ephemeral storage, because Docker will likely schedule the new containers on a Docker host that already has a Galera container running. If this happens, the new volume will overlap with the existing Galera container’s volume. If you would like to use persistent storage and scale in Docker Swarm mode, you should create another service with a couple of different options, as described in the next section.

At this point, our architecture looks like this:

Another Service with Same Cluster Name

Another way to scale is to create another service with the same CLUSTER_NAME and network. However, you can't use exactly the same command as the first one, for the following reasons:

  • The service name should be unique.
  • The port mapping must be other than 3306, since this port has been assigned to the mysql-galera service.
  • The volume name should be different to distinguish them from the existing Galera containers.

A benefit of doing this is that you will get another virtual IP address assigned to the "scaled" service. This gives your application or client an additional option: connecting to the "scaled" IP address for various tasks, e.g. performing a full backup in desync mode, a database consistency check or server auditing.
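As a rough sketch of the desync use case (the container ID, credentials and backup path are placeholders), you could desynchronize one of the scaled containers before running a heavy backup, so that Galera flow control does not throttle the rest of the cluster:

$ docker exec -it [container ID] mysql -uroot -pmypassword -e 'SET GLOBAL wsrep_desync = ON'
$ docker exec -it [container ID] bash -c 'mysqldump -uroot -pmypassword --single-transaction --all-databases > /tmp/backup.sql'
$ docker exec -it [container ID] mysql -uroot -pmypassword -e 'SET GLOBAL wsrep_desync = OFF'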

The following example shows the command to add two more nodes to the cluster in a new service called mysql-galera-scale:

$ docker service create \
--name mysql-galera-scale \
--replicas 2 \
-p 3307:3306 \
--network galera-net \
--mount type=volume,source=galera-scale-vol,destination=/var/lib/mysql \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=192.168.55.111:2379,192.168.55.112:2379,192.168.55.113:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=my_wsrep_cluster \
severalnines/pxc56

If we look into the service list, here is what we see:

$ docker service ls
ID            NAME                REPLICAS  IMAGE               COMMAND
0ii5bedv15dh  mysql-galera-scale  2/2       severalnines/pxc56
71pyjdhfg9js  mysql-galera        3/3       severalnines/pxc56

And when you look at the cluster size on one of the containers, you should get 5:

[root@docker1 ~]# docker exec -it $(docker ps | grep mysql-galera | awk {'print $1'}) mysql -uroot -pmypassword -e 'show status like "wsrep_cluster_size"'
Warning: Using a password on the command line interface can be insecure.
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 5     |
+--------------------+-------+

At this point, our architecture looks like this:

To get a clearer view of the process, we can simply look at the MySQL error log file (located under Docker’s data volume) on one of the running containers, for example:

$ tail -f /var/lib/docker/volumes/galera-vol/_data/error.log

Scale Down

Scaling down is simple. Just reduce the number of replicas, or remove the service that holds the minority of containers, to ensure that Galera stays in quorum. For example, if you have fired up two groups of nodes with 3 + 2 containers for a total of 5, the majority needs to survive, so you can only remove the second group with 2 containers. If you have three groups with 3 + 2 + 2 containers, you can lose a maximum of 3 containers. This is because the Docker Swarm scheduler simply terminates and removes the containers corresponding to the service, which makes Galera think that those nodes are failing - they are not shut down gracefully.

If you scaled up using the "docker service scale" command, you should scale down the same way, by reducing the number of replicas:

$ docker service scale mysql-galera=3

Otherwise, if you chose to create another service to scale up, then simply remove the respective service to scale down:

$ docker service rm mysql-galera-scale

Known Limitations

There will be no automatic recovery if a split-brain happens (where all nodes end up in Non-Primary state). This is because the MySQL process is still running, yet it refuses to serve any data and returns an error to the client. Docker has no way to detect this, since all it cares about is the foreground MySQL process, which is not terminated, killed or stopped. Automating this recovery is risky, especially if the service discovery is co-located with the Docker host (etcd would also lose contact with the other members). Even if the service discovery cluster is healthy somewhere else, it is probably unreachable from the Galera containers' perspective, preventing them from seeing each other's status correctly during the glitch.

In this case, you will need to intervene manually.

Choose the most advanced node to bootstrap, then run the following command to promote it to Primary (the other nodes will then rejoin automatically once the network recovers):

$ docker exec -it [container ID] mysql -uroot -pyoursecret -e 'set global wsrep_provider_option="pc.bootstrap=1"'

Also, there is no automatic cleanup of the discovery service registry. You can remove all entries using either of the following commands (assuming the CLUSTER_NAME is my_wsrep_cluster):

$ curl http://192.168.55.111:2379/v2/keys/galera/my_wsrep_cluster?recursive=true -XDELETE # or
$ etcdctl rm /galera/my_wsrep_cluster --recursive

Conclusion

This combination of technologies opens the door to a more reliable database setup in the Docker ecosystem. Using service discovery to store state makes it possible to run stateful containers and achieve a homogeneous setup.

In the next blog post, we are going to look into how to manage Galera Cluster on Docker.

Online schema change for MySQL & MariaDB - comparing GitHub’s gh-ost vs pt-online-schema-change


Database schema change is one of the most common activities that a MySQL DBA has to tackle. Whether you use MySQL Replication or Galera Cluster, direct DDLs are troublesome and sometimes not feasible to execute. Add the requirement to perform the change while all databases are online, and it can get pretty daunting.

Thankfully, online schema tools are there to help DBAs deal with this problem. Arguably, the most popular of them is Percona’s pt-online-schema-change, which is part of Percona Toolkit.

It has been used by MySQL DBAs for years and is proven as a flexible and reliable tool. Unfortunately, it is not without drawbacks.

To understand these, we need to understand how it works internally.

How does pt-online-schema-change work?

Pt-online-schema-change works in a very simple way. It creates a temporary table with the desired new schema - for instance, with an added index or a removed column. Then, it creates triggers on the old table - those triggers mirror changes that happen on the original table to the new table for the duration of the schema change. If a row is added to the original table, it is also added to the new one. Likewise, if a row is modified or deleted in the old table, the change is also applied to the new table. In the background, a process copies data (using LOW_PRIORITY INSERT) from the old table to the new one. Once the data has been copied, RENAME TABLE is executed to rename "yourtable" to "yourtable_old" and "yourtable_new" to "yourtable". This is an atomic operation, and if something goes wrong, it is possible to recover the old table.

The process described above has some limitations. For starters, it is not possible to reduce the overhead of the tool to zero. Pt-online-schema-change gives you an option to define the maximum allowed replication lag and, if that threshold is crossed, it stops copying data between the old and new table. It is also possible to pause the background process entirely. The problem is that we are talking only about the background process of running INSERTs. It is not possible to reduce the overhead caused by the fact that every operation on "yourtable" is duplicated in "yourtable_new" through triggers. If you removed the triggers, the old and new table would go out of sync without any means to sync them again. Therefore, when you run pt-online-schema-change on your system, it always adds some overhead, even if it is paused or throttled. How big the overhead is depends on how many writes hit the table undergoing the schema change.
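For illustration, a typical invocation might look something like the following (the host, credentials, schema and thresholds are assumptions; note that --max-lag only throttles the background copy, not the triggers):

$ pt-online-schema-change \
--alter "ADD INDEX idx_k_c (k, c)" \
--max-lag 5 \
--chunk-size 1000 \
--execute \
h=127.0.0.1,u=root,p=mypassword,D=sbtest1,t=sbtest1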

Another issue is, again, caused by triggers - precisely by the fact that, to create triggers, one has to acquire a lock on MySQL's metadata. This can become a serious problem if you have highly concurrent traffic or use long transactions. Under such load, it may be virtually impossible (and we've seen such databases) to use pt-online-schema-change, because it cannot acquire the metadata lock needed to create the required triggers. Additionally, waiting on the metadata lock can also block further transactions, basically grinding all database operations to a halt.

Yet another problem is foreign keys - unfortunately, there is no simple way of handling them. Pt-online-schema-change gives you two methods to approach this issue, and neither of them is really good. The main issue here is that a foreign key of a given name can only refer to a single table, and it sticks to it - even if you rename the referred table, the foreign key will follow the rename. This leads to the problem: after RENAME TABLE, the foreign key will point to 'yourtable_old', not 'yourtable'.

One workaround is to not use:

RENAME TABLE ‘yourtable’ TO ‘yourtable_old’, ‘yourtable_new’ TO ‘yourtable’;

Instead, use a two-step approach:

DROP TABLE ‘yourtable’; RENAME TABLE ‘yourtable_new’ TO ‘yourtable’;

This poses a serious problem: if, for some reason, RENAME TABLE fails, there is no going back, as the original table has already been dropped.

Another approach would be to create a second foreign key, under a different name, which refers to 'yourtable_new'. After RENAME TABLE, it will point to 'yourtable', which is exactly what we want. The thing is, you need to execute a direct ALTER to create such a foreign key - which rather defeats the point of using an online schema change tool, namely avoiding direct alters. If the altered table is large, such an operation is not feasible on Galera Cluster (cluster-wide stall caused by TOI) or in a MySQL replication setup (slave lag induced by the serialized ALTER).

As you can see, while pt-online-schema-change is a useful tool, it has serious limitations which you need to be aware of before you use it. If you use MySQL at scale, these limitations may become a serious motivation to do something about it.

Introducing GitHub’s gh-ost

Motivation alone is not enough - you also need resources to create a new solution. GitHub recently released gh-ost, their take on online schema change. Let’s take a look at how it compares to Percona’s pt-online-schema-change and how it can be used to avoid some of its limitations.

To better understand the difference between the two tools, let's take a look at how gh-ost works.

Gh-ost creates a temporary table with the altered schema, just like pt-online-schema-change does - it uses the "_yourtable_gho" pattern. It executes INSERT queries of the following form to copy data from the old to the new table:

insert /* gh-ost `sbtest1`.`sbtest1` */ ignore into `sbtest1`.`_sbtest1_gho` (`id`, `k`, `c`, `pad`)
      (select `id`, `k`, `c`, `pad` from `sbtest1`.`sbtest1` force index (`PRIMARY`)
        where (((`id` > ?)) and ((`id` < ?) or ((`id` = ?)))) lock in share mode

As you can see, it is a variation of INSERT INTO new_table SELECT * FROM old_table. It uses the primary key to split the data into chunks and then works through them.

In pt-online-schema-change, current traffic is handled using triggers. Gh-ost uses a triggerless approach - it uses binary logs to track and apply the changes which have happened since gh-ost started copying data. It connects to one of the hosts - by default one of the slaves - pretends to be a slave itself, and asks for binary logs.

This behavior has a couple of repercussions. First of all, network traffic is increased compared to pt-online-schema-change - not only does gh-ost have to copy table data, it also has to pull binary logs.

It also requires binary logs in row-based format for full data consistency - if you use statement-based or mixed replication, gh-ost won't work in your setup. As a workaround, you can create a new slave, enable log_slave_updates and set it to store events in row format. Reading data from a slave is, in fact, the default way in which gh-ost operates - it makes perfect sense, as pulling binary logs adds some overhead, and if you can avoid additional overhead on the master, you most likely want to. Of course, if your master uses the row-based replication format, you can force gh-ost to connect to it and get the binary logs.
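As a sketch, the slave-side my.cnf settings for the row-format workaround mentioned above might look like this (all standard MySQL options; a restart is required for log_bin):

[mysqld]
log_bin = binlog
log_slave_updates = 1
binlog_format = ROW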

What is good about this design is that you don't have to create triggers, which, as we discussed, could become a serious problem or even a blocker. What is also great is that you can always stop parsing binary logs - it's as if you just ran STOP SLAVE. You have the binlog coordinates, so you can easily start from the same position later on. This makes it possible to stop practically all operations executed by gh-ost - not only the background process of copying data from the old to the new table, but also any load related to keeping the new table in sync with the old one. This is a great feature in a production environment - pt-online-schema-change requires constant monitoring, as you can only estimate the additional load on the system. Even when paused, it still adds some overhead and, under heavy load, this overhead may result in an unstable database. With gh-ost, on the other hand, you can just pause the whole process and the workload pattern goes back to what you are used to seeing - no additional load whatsoever related to the schema change. This is really great - it means you can start the migration at 9am, when you start your day, and stop it at 5pm when you leave the office. You can be sure that you won't get paged late at night because a paused schema change process that is not actually 100% paused is causing problems on your production systems.
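As an illustration of this workflow, a gh-ost run against a replica might look roughly like this (the hostname, credentials, schema and thresholds are assumptions, not a prescription):

$ gh-ost \
--host=replica.example.com \
--user=root \
--password=mypassword \
--database=sbtest1 \
--table=sbtest1 \
--alter="ADD INDEX idx_k_c (k, c)" \
--max-lag-millis=2000 \
--throttle-flag-file=/tmp/gh-ost.throttle \
--execute

Pausing and resuming is then just a matter of touching or removing the throttle flag file:

$ touch /tmp/gh-ost.throttle # pause the migration
$ rm /tmp/gh-ost.throttle    # resume the migration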

Unfortunately, gh-ost is not without drawbacks. For starters, foreign keys: pt-online-schema-change does not provide any good way of altering tables which contain them, but it is still ahead of gh-ost, which does not support foreign keys at all - at the moment of writing, that is; it may change in the future. The same goes for triggers: gh-ost, at the moment of writing, does not support them at all. Pt-online-schema-change shares this limitation - it stems from pre-5.7 MySQL, where you couldn't have more than one trigger of a given type defined on a table (and pt-online-schema-change had to create triggers for its own purposes). Even though that limitation is removed in MySQL 5.7, pt-online-schema-change still does not support tables with triggers.

One of the main limitations of gh-ost is, definitely, that it does not support Galera Cluster. This is because of how gh-ost performs the table switch - it uses LOCK TABLE, which does not work well with Galera. As of now, there is no known fix or workaround for this issue, which leaves pt-online-schema-change as the only option for Galera Cluster.

These are probably the most important limitations of gh-ost, but there are more. Minimal row image is not supported (which makes your binlogs grow larger), and JSON and generated columns in 5.7 are not supported either. The migration key must not contain NULL values, and there are limitations when it comes to mixed-case table names. You can find more details on all requirements and limitations of gh-ost in its documentation.

In our next blog post, we will take a look at how gh-ost operates in practice, how you can test your changes, and how to perform a migration. We will also discuss throttling of gh-ost.


Watch the evolution of ClusterControl for MySQL & MongoDB


ClusterControl reduces complexity of managing your database infrastructure on premise or in the cloud, while adding support for new technologies; enabling you to truly automate mixed environments for next-level applications.

Since the launch of ClusterControl in 2012, we’ve experienced growth in new industries with customers who are benefiting from the advancements ClusterControl has to offer.

In addition to reaching new highs in ClusterControl demand, this past year we've doubled the size of our team, allowing us to continue to provide even more improvements to ClusterControl.

Watch this short video to see where ClusterControl stands today.

Video: An Overview of the Features & Functions of ClusterControl


The video below demonstrates the top features and functions included in ClusterControl.  

ClusterControl is an all-inclusive database management system that lets you easily deploy, monitor, manage and scale highly available open source databases on-premise or in the cloud.

Included in this presentation are…

  • Deploying MySQL, MongoDB & PostgreSQL nodes and clusters
  • Overview of the monitoring dashboard
  • Individual node or cluster monitoring
  • Query monitor system
  • Creating and restoring immediate and scheduled backups
  • Configuration management
  • Developer Studio introduction
  • Reviewing log files
  • Scaling database clusters

Demonstration Videos: Top Four Feature Sets of ClusterControl for MySQL, MongoDB & PostgreSQL


The videos below demonstrate the top features and functions included in ClusterControl.  

Deploy

Deploy the best open source database for the job at hand using repeatable deployments with best practice configurations for MySQL, MySQL Cluster, Galera Cluster, Percona, PostgreSQL or MongoDB databases. Reduce time spent on manual provisioning and more time for experimentation and innovation.

Management

Easily handle and automate your day to day tasks uniformly and transparently across a mixed database infrastructure. Automate backups, health checks, database repair/recovery, security and upgrades using battle tested best practices.

Monitoring

Unified and comprehensive real-time monitoring of your entire database and server infrastructure. Gain access to 100+ key database and host metrics that matter to your operational performance. Visualize performance in custom dashboards to establish operational baselines and support capacity planning.

Scaling

Handle unplanned workload changes by dynamically scaling out with more nodes. Optimize resource usage by scaling back nodes.


MySQL on Docker: Composing the Stack


Docker 1.13 introduces a long-awaited feature called compose-file support, which allows us to define our containers in a nice, simple config file instead of a single long command. If you look at our previous "MySQL on Docker" blog posts, we used multiple long command lines to run containers and services. With a compose-file, containers are easily specified for deployment. This reduces the risk of human error, as you do not have to remember long commands with multiple parameters.

In this blog post, we'll show you how to use compose-files, with simple examples around MySQL deployments. We assume you have Docker Engine 1.13 installed on 3 physical hosts and that Swarm mode is configured on all of them.

Introduction to Compose-file

In a compose-file, you specify everything in YAML format, as opposed to trying to remember all the arguments you would otherwise have to pass to Docker commands. You can define services, networks and volumes there. The definition is picked up by Docker, and it is very much like passing command-line parameters to the "docker run|network|volume" commands.

As an introduction, we are going to deploy a simple standalone MySQL container. Before you start writing a Compose file, you first need to know the run command. Taken from our first MySQL on Docker blog series, let’s compose the following “docker run” command:

$ docker run --detach \
--name=test-mysql \
--publish 6603:3306 \
--env="MYSQL_ROOT_PASSWORD=mypassword" \
-v /storage/docker/mysql-datadir:/var/lib/mysql \
mysql

The docker-compose command will look for a default file called “docker-compose.yml” in the current directory. So, let’s first create the required directories beforehand:

$ mkdir -p ~/compose-files/mysql/single
$ mkdir -p /storage/docker/mysql-datadir
$ cd ~/compose-files/mysql/single

In YAML, here is what should be written:

version: '2'

services:
  mysql:
    image: mysql
    container_name: test-mysql
    ports:
      - 6603:3306
    environment:
      MYSQL_ROOT_PASSWORD: "mypassword"
    volumes:
      - /storage/docker/mysql-datadir:/var/lib/mysql

Save the above content into “~/compose-files/mysql/single/docker-compose.yml”. Ensure you are in the current directory ~/compose-files/mysql/single, then fire it up by running the following command:

$ docker-compose up -d
WARNING: The Docker Engine you're using is running in swarm mode.

Compose does not use swarm mode to deploy services to multiple nodes in a swarm. All containers will be scheduled on the current node.

To deploy your application across the swarm, use `docker stack deploy`.

Creating test-mysql

Verify if the container is running in detached mode:

[root@docker1 single]# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
379d5c15ef44        mysql               "docker-entrypoint..."   8 minutes ago       Up 8 minutes        0.0.0.0:6603->3306/tcp   test-mysql

Congratulations! We now have a MySQL container running with just a single command.
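As a quick sanity check (assuming a mysql client is installed on the Docker host), you can connect through the published port:

$ mysql -h127.0.0.1 -P6603 -uroot -pmypassword -e 'SELECT VERSION()'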

Deploying a Stack

A compose-file simplifies things - it provides us with a clearer view of how the infrastructure should look. Let's create a container stack that consists of a website running on Drupal, using a MySQL instance under a dedicated network, and link them together.

Similar to above, let’s take a look at the command line version in the correct order to build this stack:

$ docker volume create mysql_data
$ docker network create drupal_mysql_net --driver=bridge
$ docker run -d --name=mysql-drupal --restart=always -v mysql_data:/var/lib/mysql --net=drupal_mysql_net -e MYSQL_ROOT_PASSWORD="mypassword" -e MYSQL_DATABASE="drupal" mysql
$ docker run -d --name=drupal -p 8080:80 --restart=always -v /var/www/html/modules -v /var/www/html/profiles -v /var/www/html/themes -v /var/www/html/sites --link mysql-drupal:mysql --net=drupal_mysql_net drupal

To start composing, let’s first create a directory for our new stack:

$ mkdir -p ~/compose-files/drupal-mysql
$ cd ~/compose-files/drupal-mysql

Then, write the content of docker-compose.yml as per below:

version: '2'

services:
  mysql:
    image: mysql
    container_name: mysql-drupal
    environment:
      MYSQL_ROOT_PASSWORD: "mypassword"
      MYSQL_DATABASE: "drupal"
    volumes:
      - mysql_data:/var/lib/mysql
    restart: always
    networks:
      - drupal_mysql_net

  drupal:
    depends_on:
      - mysql
    image: drupal
    container_name: drupal
    ports:
      - 8080:80
    volumes:
      - /var/www/html/modules
      - /var/www/html/profiles
      - /var/www/html/themes
      - /var/www/html/sites
    links:
      - mysql:mysql
    restart: always
    networks:
      - drupal_mysql_net

volumes:
  mysql_data:

networks:
  drupal_mysql_net:
    driver: bridge

Fire them up:

$ docker-compose up -d
..
Creating network "drupalmysql_drupal_mysql_net" with driver "bridge"
Creating volume "drupalmysql_mysql_data" with default driver
Pulling drupal (drupal:latest)...
..
Creating mysql-drupal
Creating drupal

Docker will perform the deployment as follows:

  1. Create network
  2. Create volume
  3. Pull images
  4. Create mysql-drupal (since container “drupal” is dependent on it)
  5. Create the drupal container

At this point, our architecture can be illustrated as follows:

We can then specify ‘mysql’ as the MySQL host in the installation wizard page since both containers are linked together. That’s it. To tear them down, simply run the following command under the same directory:

$ docker-compose down

The corresponding containers will be terminated and removed accordingly. Take note that the docker-compose command is bound to the individual physical host running Docker. In order to run across multiple physical hosts in a Swarm, it needs to be handled differently, using the "docker stack" command. We'll explain this in the next section.

Composing a Stack Across Swarm

First, make sure the Docker Engine is running v1.13 and Swarm mode is enabled and in ready state:

$ docker node ls
ID                           HOSTNAME       STATUS  AVAILABILITY  MANAGER STATUS
8n8t3r4fvm8u01yhli9522xi9 *  docker1.local  Ready   Active        Reachable
o1dfbbnmhn1qayjry32bpl2by    docker2.local  Ready   Active        Reachable
tng5r9ax0ve855pih1110amv8    docker3.local  Ready   Active        Leader

In order to use the stack feature for Docker Swarm mode, we have to use the Docker Compose version 3 format. We are going to deploy a setup similar to the above, except with a three-node Galera setup as the MySQL backend, which we already explained in detail in this blog post.

Firstly, create a directory for our new stack:

$ mkdir -p ~/compose-files/drupal-galera
$ cd ~/compose-files/drupal-galera

Then add the following lines into “docker-compose.yml”:

version: '3'

services:

  galera:
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
        delay: 30s
        max_attempts: 3
        window: 60s
      update_config:
        parallelism: 1
        delay: 10s
        max_failure_ratio: 0.3
    image: severalnines/pxc56
    environment:
      MYSQL_ROOT_PASSWORD: "mypassword"
      CLUSTER_NAME: "my_galera"
      XTRABACKUP_PASSWORD: "mypassword"
      DISCOVERY_SERVICE: '192.168.55.111:2379,192.168.55.112:2379,192.168.55.207:2379'
      MYSQL_DATABASE: 'drupal'
    networks:
      - galera_net

  drupal:
    depends_on:
      - galera
    deploy:
      replicas: 1
    image: drupal
    ports:
      - 8080:80
    volumes:
      - drupal_modules:/var/www/html/modules
      - drupal_profile:/var/www/html/profiles
      - drupal_theme:/var/www/html/themes
      - drupal_sites:/var/www/html/sites
    networks:
      - galera_net

volumes:
  drupal_modules:
  drupal_profile:
  drupal_theme:
  drupal_sites:

networks:
  galera_net:
    driver: overlay

Note that the Galera image that we used (severalnines/pxc56) requires a running etcd cluster installed on each of the Docker physical hosts. Please refer to this blog post for the prerequisite steps.

One of the important parts of our compose-file is the max_attempts parameter under the restart_policy section. We have to specify a hard limit on the number of restarts in case of failure. This makes the deployment process safer, because by default, the Swarm scheduler never gives up attempting to restart containers. Without a limit, that restart loop can fill up the physical host's disk space with unusable containers while the scheduler keeps failing to bring the containers up to the desired state. This is a common approach when handling stateful services like MySQL. It's better to bring them down altogether than to have them running in an inconsistent state.

To start them all, just execute the following command in the same directory where docker-compose.yml resides:

$ docker stack deploy --compose-file=docker-compose.yml my_drupal

Verify the stack is created with 2 services (drupal and galera):

$ docker stack ls
NAME       SERVICES
my_drupal  2

We can also list the current tasks in the created stack. The result is a combined version of “docker service ps my_drupal_galera” and “docker service ps my_drupal_drupal” commands:

$ docker stack ps my_drupal
ID            NAME                IMAGE                      NODE           DESIRED STATE  CURRENT STATE           ERROR  PORTS
609jj9ji6rxt  my_drupal_galera.1  severalnines/pxc56:latest  docker3.local  Running        Running 7 minutes ago
z8mcqzf29lbq  my_drupal_drupal.1  drupal:latest              docker1.local  Running        Running 24 minutes ago
skblp9mfbbzi  my_drupal_galera.2  severalnines/pxc56:latest  docker1.local  Running        Running 10 minutes ago
cidn9kb0d62u  my_drupal_galera.3  severalnines/pxc56:latest  docker2.local  Running        Running 7 minutes ago

Once we see CURRENT STATE as Running, we can start the Drupal installation by connecting to any of the Docker hosts' IP addresses or hostnames on port 8080. In this case we used docker3 (although the drupal container is deployed on docker1): http://192.168.55.113:8080/. Proceed with the installation and specify 'galera' as the MySQL host and 'drupal' as the database name (as defined in the compose-file under the MYSQL_DATABASE environment variable):

That's it - the stack deployment was simplified by using a compose-file. At this point, our architecture looks something like this:

Lastly, to remove the stack, just run the following command:

$ docker stack rm my_drupal
Removing service my_drupal_galera
Removing service my_drupal_drupal
Removing network my_drupal_galera_net

Conclusion

Using a compose-file can save you time and reduce the risk of human error, compared to working with long command lines. It is a perfect tool to master before working with multi-container Docker applications, dealing with multiple deployment environments (e.g. dev, test, staging, pre-prod, prod) and handling much more complex services, just like MySQL Galera Cluster. Happy containerizing!

Online Schema Upgrade in MySQL Galera Cluster using RSU Method


This post is a continuation of our previous post on Online Schema Upgrade in Galera using TOI method. We will now show you how to perform a schema upgrade using the Rolling Schema Upgrade (RSU) method.

RSU and TOI

As we discussed, when using TOI, a change happens at the same time on all of the nodes. This can become a serious limitation, as executing schema changes that way implies that no other queries can run. For long ALTER statements, the cluster may even be unavailable for hours. Obviously, this is not something you can accept in production. The RSU method addresses this weakness - changes happen on one node at a time, while the other nodes are not affected and can serve traffic. Once the ALTER completes on one node, the node rejoins the cluster and you can proceed with executing the schema change on the next node.

Such behavior comes with its own set of limitations. The main one is that the scheduled schema change has to be compatible. What does that mean? Let's think about it for a while. First of all, we need to keep in mind that the cluster is up and running all the time - the altered node has to be able to accept all of the traffic which hits the remaining nodes. In short, a DML executed against the old schema has to work on the new schema as well (and vice versa, if you use some sort of round-robin-like connection distribution in your Galera Cluster). We will focus on MySQL compatibility, but remember that your application also has to work with both altered and non-altered nodes - make sure your alter won't break the application logic. One good practice is to explicitly pass column names in queries - don't rely on "SELECT *", because you never know how many columns you'll get in return.

Galera and Row-based binary log format

Ok, so DML has to work on both the old and the new schema. How are DMLs transferred between Galera nodes? Does it affect which changes are compatible and which are not? Yes, indeed it does. Galera does not use regular MySQL replication, but it still relies on it to transfer events between the nodes. To be precise, Galera uses the ROW format for events. An event in row format (after decoding) may look like this:

### INSERT INTO `schema`.`table`
### SET
###   @1=1
###   @2=1
###   @3='88764053989'
###   @4='14700597838'

Or:

### UPDATE `schema`.`table`
### WHERE
###   @1=1
###   @2=1
###   @3='88764053989'
###   @4='14700597838'
### SET
###   @1=2
###   @2=2
###   @3='88764053989'
###   @4='81084251066'

As you can see, there is a visible pattern: a row is identified by its content. There are no column names, just their order. This alone should turn on some warning lights: "What would happen if I removed one of the columns?" Well, if it is the last column, this is acceptable. If you removed a column in the middle, it would mess up the column order and, as a result, replication would break. A similar thing happens if you add a column in the middle instead of at the end. There are more constraints, though. Changing a column definition works as long as it stays within the same data type - you can alter an INT column to become BIGINT, but you cannot change an INT column into VARCHAR - that will break replication. You can find a detailed description of which changes are compatible and which are not in the MySQL documentation. No matter what you see in the documentation, to stay on the safe side it's better to run some tests on a separate development/staging cluster. Make sure the change works not only according to the documentation, but also in your particular setup.
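To make this concrete, here is a sketch of changes that, per the rules above, typically survive RSU versus ones that break row-event application (the added 'note' column is illustrative):

-- Generally compatible: widening within the same type family, or appending a column at the end
mysql> ALTER TABLE sbtest1.sbtest1 MODIFY COLUMN k BIGINT NOT NULL DEFAULT '0';
mysql> ALTER TABLE sbtest1.sbtest1 ADD COLUMN note VARCHAR(64) NULL;

-- Incompatible: changing the type family, or shifting the column order
mysql> ALTER TABLE sbtest1.sbtest1 MODIFY COLUMN k VARCHAR(30) NOT NULL DEFAULT '';
mysql> ALTER TABLE sbtest1.sbtest1 ADD COLUMN note VARCHAR(64) AFTER id;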

All in all, as you can clearly see, performing RSU in a safe way is much more complex than just running a couple of commands. Still, as the commands matter, let's take a look at an example of how to perform an RSU and what can go wrong in the process.

RSU example

Initial setup

Let's imagine a rather simple example of an application. We will use a benchmark tool, Sysbench, to generate content and traffic, but the flow will be the same for almost every application - Wordpress, Joomla, Drupal, you name it. We will use HAProxy collocated with our application to split reads and writes among the Galera nodes in round-robin fashion. You can check below how HAProxy sees the Galera cluster.

The whole topology looks like this:

Traffic is generated using the following command:

while true ; do sysbench /root/sysbench/src/lua/oltp_read_write.lua --threads=4 --max-requests=0 --time=3600 --mysql-host=10.0.0.100 --mysql-user=sbtest --mysql-password=sbtest --mysql-port=3307 --tables=32 --report-interval=1 --skip-trx=on --table-size=100000 --db-ps-mode=disable run ; done

The schema looks like this:

mysql> SHOW CREATE TABLE sbtest1.sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=29986632 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

First, let’s see how we can add an index to this table. Adding an index is a compatible change which can be easily done using RSU.

mysql> SET SESSION wsrep_OSU_method=RSU;
Query OK, 0 rows affected (0.00 sec)
mysql> ALTER TABLE sbtest1.sbtest1 ADD INDEX idx_new (k, c);
Query OK, 0 rows affected (5 min 19.59 sec)

As you can see in the Node tab, the host on which we executed the change has automatically switched to Donor/Desynced state which ensures that this host will not impact the rest of the cluster if it gets slowed down by the ALTER.
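You can also confirm this directly from the node being altered - the wsrep_local_state_comment status variable should read "Donor/Desynced" for the duration of the ALTER:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';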

Let’s check how our schema looks now:

mysql> SHOW CREATE TABLE sbtest1.sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`),
  KEY `idx_new` (`k`,`c`)
) ENGINE=InnoDB AUTO_INCREMENT=29986632 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

As you can see, the index has been added. Keep in mind, though, that this happened only on that particular node. To accomplish the full schema change, you have to follow this process on each of the remaining nodes of the Galera Cluster. To finish with the first node, we can switch wsrep_OSU_method back to TOI:

mysql> SET SESSION wsrep_OSU_method=TOI;
Query OK, 0 rows affected (0.00 sec)
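For reference, a rough sketch of the whole rolling procedure across all three nodes could look like the following (the hostnames and credentials are assumptions; in practice, wait for each node to return to the Synced state before moving on to the next one):

$ for host in 10.0.0.101 10.0.0.102 10.0.0.103 ; do
mysql -h $host -uroot -pmypassword -e "SET SESSION wsrep_OSU_method=RSU; ALTER TABLE sbtest1.sbtest1 ADD INDEX idx_new (k, c); SET SESSION wsrep_OSU_method=TOI;"
done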

We are not going to walk through the remaining nodes one by one, because the process is identical - enable RSU at the session level, run the ALTER, switch back to TOI. What is more interesting is what happens if the change is incompatible. Let's again take a quick look at the schema:

mysql> SHOW CREATE TABLE sbtest1.sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`),
  KEY `idx_new` (`k`,`c`)
) ENGINE=InnoDB AUTO_INCREMENT=29986632 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Let’s say we want to change the type of column ‘k’ from INT to VARCHAR(30) on one node.

mysql> SET SESSION wsrep_OSU_method=RSU;
Query OK, 0 rows affected (0.00 sec)
mysql> ALTER TABLE sbtest1.sbtest1 MODIFY COLUMN k VARCHAR(30) NOT NULL DEFAULT '';
Query OK, 10004785 rows affected (1 hour 14 min 51.89 sec)
Records: 10004785  Duplicates: 0  Warnings: 0

Now, let's take a look at the schema:

mysql> SHOW CREATE TABLE sbtest1.sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` varchar(30) NOT NULL DEFAULT '',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`),
  KEY `idx_new` (`k`,`c`)
) ENGINE=InnoDB AUTO_INCREMENT=29986632 DEFAULT CHARSET=latin1
1 row in set (0.02 sec)

Everything is as we expected - the 'k' column has been changed to VARCHAR. Now we can check whether this change is acceptable to the Galera Cluster. To test it, we will use one of the remaining, unaltered nodes to execute the following query:

mysql> INSERT INTO sbtest1.sbtest1 (k, c, pad) VALUES (123, 'test', 'test');
Query OK, 1 row affected (0.19 sec)

Let’s see what happened.

It definitely doesn't look good - our node is down. The logs give more details:

2017-04-07T10:51:14.873524Z 5 [ERROR] Slave SQL: Column 1 of table 'sbtest1.sbtest1' cannot be converted from type 'int' to type 'varchar(30)', Error_code: 1677
2017-04-07T10:51:14.873560Z 5 [Warning] WSREP: RBR event 3 Write_rows apply warning: 3, 982675
2017-04-07T10:51:14.879120Z 5 [Warning] WSREP: Failed to apply app buffer: seqno: 982675, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 2th time
2017-04-07T10:51:14.879272Z 5 [ERROR] Slave SQL: Column 1 of table 'sbtest1.sbtest1' cannot be converted from type 'int' to type 'varchar(30)', Error_code: 1677
2017-04-07T10:51:14.879287Z 5 [Warning] WSREP: RBR event 3 Write_rows apply warning: 3, 982675
2017-04-07T10:51:14.879399Z 5 [Warning] WSREP: Failed to apply app buffer: seqno: 982675, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2017-04-07T10:51:14.879618Z 5 [ERROR] Slave SQL: Column 1 of table 'sbtest1.sbtest1' cannot be converted from type 'int' to type 'varchar(30)', Error_code: 1677
2017-04-07T10:51:14.879633Z 5 [Warning] WSREP: RBR event 3 Write_rows apply warning: 3, 982675
2017-04-07T10:51:14.879730Z 5 [Warning] WSREP: Failed to apply app buffer: seqno: 982675, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2017-04-07T10:51:14.879911Z 5 [ERROR] Slave SQL: Column 1 of table 'sbtest1.sbtest1' cannot be converted from type 'int' to type 'varchar(30)', Error_code: 1677
2017-04-07T10:51:14.879924Z 5 [Warning] WSREP: RBR event 3 Write_rows apply warning: 3, 982675
2017-04-07T10:51:14.885255Z 5 [ERROR] WSREP: Failed to apply trx: source: 938415a6-1aab-11e7-ac29-0a69a4a1dafe version: 3 local: 0 state: APPLYING flags: 1 conn_id: 125559 trx_id: 2856843 seqnos (l: 392283, g: 9
82675, s: 982674, d: 982563, ts: 146831275805149)
2017-04-07T10:51:14.885271Z 5 [ERROR] WSREP: Failed to apply trx 982675 4 times
2017-04-07T10:51:14.885281Z 5 [ERROR] WSREP: Node consistency compromized, aborting…

As can be seen, Galera complained that the column cannot be converted from INT to VARCHAR(30). It attempted to re-execute the writeset four times and failed, unsurprisingly. Galera therefore determined that node consistency was compromised, and the node was kicked out of the cluster. The remaining content of the logs shows this process:

2017-04-07T10:51:14.885560Z 5 [Note] WSREP: Closing send monitor...
2017-04-07T10:51:14.885630Z 5 [Note] WSREP: Closed send monitor.
2017-04-07T10:51:14.885644Z 5 [Note] WSREP: gcomm: terminating thread
2017-04-07T10:51:14.885828Z 5 [Note] WSREP: gcomm: joining thread
2017-04-07T10:51:14.885842Z 5 [Note] WSREP: gcomm: closing backend
2017-04-07T10:51:14.896654Z 5 [Note] WSREP: view(view_id(NON_PRIM,6fcd492a,37) memb {
        b13499a8,0
} joined {
} left {
} partitioned {
        6fcd492a,0
        938415a6,0
})
2017-04-07T10:51:14.896746Z 5 [Note] WSREP: view((empty))
2017-04-07T10:51:14.901477Z 5 [Note] WSREP: gcomm: closed
2017-04-07T10:51:14.901512Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2017-04-07T10:51:14.901531Z 0 [Note] WSREP: Flow-control interval: [16, 16]
2017-04-07T10:51:14.901541Z 0 [Note] WSREP: Received NON-PRIMARY.
2017-04-07T10:51:14.901550Z 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 982675)
2017-04-07T10:51:14.901563Z 0 [Note] WSREP: Received self-leave message.
2017-04-07T10:51:14.901573Z 0 [Note] WSREP: Flow-control interval: [0, 0]
2017-04-07T10:51:14.901581Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2017-04-07T10:51:14.901589Z 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 982675)
2017-04-07T10:51:14.901602Z 0 [Note] WSREP: RECV thread exiting 0: Success
2017-04-07T10:51:14.902701Z 5 [Note] WSREP: recv_thread() joined.
2017-04-07T10:51:14.902720Z 5 [Note] WSREP: Closing replication queue.
2017-04-07T10:51:14.902730Z 5 [Note] WSREP: Closing slave action queue.
2017-04-07T10:51:14.902742Z 5 [Note] WSREP: /usr/sbin/mysqld: Terminated.

Of course, ClusterControl will attempt to recover such a node - recovery involves running SST, so incompatible schema changes will be removed. But that puts us back at square one - our schema change has been reverted.

As you can see, while running an RSU is a very simple process, underneath it can be rather complex. It requires some tests and preparations to make sure that you won't lose a node just because the schema change was not compatible.

Video: ClusterControl for Galera Cluster


This video walks you through the features that ClusterControl offers for Galera Cluster and how you can use them to deploy, manage, monitor and scale your open source database environments.

ClusterControl and Galera Cluster

ClusterControl provides advanced deployment, management, monitoring, and scaling functionality to get your Galera clusters up-and-running using proven methodologies that you can depend on to work.

At the core of ClusterControl is its automation functionality that lets you automate many of the database tasks you have to perform regularly, like deploying new clusters, adding and scaling new nodes, running backups and upgrades, and more.

ClusterControl is cluster-aware, topology-aware and able to provision all members of the Galera Cluster, including the replication chain connected to it.

To learn more check out the following resources…

MySQL on Docker: ClusterControl and Galera Cluster on Docker Swarm


Our journey in adopting MySQL and MariaDB in containerized environments continues, with ClusterControl coming into the picture to facilitate deployment and management. We already have a ClusterControl image hosted on Docker Hub, which can deploy different replication/cluster topologies on multiple containers. With the introduction of Docker Swarm, a native orchestration tool embedded inside Docker Engine, scaling and provisioning containers has become much easier. It also has high availability covered, by running services on multiple Docker hosts.

In this blog post, we'll be experimenting with automatic provisioning of Galera Cluster on Docker Swarm with ClusterControl. ClusterControl usually deploys database clusters on bare-metal, virtual machines and cloud instances. It relies on SSH (through libssh) as its core communication module to connect to the managed hosts, so these do not require any agents. The same rule applies to containers, and that's what we are going to show in this blog post.

ClusterControl as Docker Swarm Service

We have built a Docker image with extended logic to handle deployment in container environments in a semi-automatic way. The image is now available on Docker Hub and the code is hosted in our Github repository. Please note that only this image is capable of deploying on containers; the logic is not available in the standard ClusterControl installation packages.

The extended logic lives in deploy-container.sh, a script that monitors a custom table inside the CMON database called "cmon.containers". A newly created database container reports and registers itself into this table, and the script looks for new entries and performs the necessary actions using the ClusterControl CLI. The deployment is automatic, and you can monitor the progress directly from the ClusterControl UI or using the "docker logs" command.
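To peek at what the script sees, you can query that table from inside the ClusterControl container (a sketch; the CMON database credentials depend on your setup, so you will be prompted for the password):

$ docker exec -it $(docker ps | grep clustercontrol | awk {'print $1'}) mysql -uroot -p -e 'SELECT * FROM cmon.containers'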

Before we go further, take note of some prerequisites for running ClusterControl and Galera Cluster on Docker Swarm:

  • Docker Engine version 1.12 and later.
  • Docker Swarm Mode is initialized.
  • ClusterControl must be connected to the same overlay network as the database containers.

To run ClusterControl as a service using “docker stack”, the following definition should be enough:

 clustercontrol:
    deploy:
      replicas: 1
    image: severalnines/clustercontrol
    ports:
      - 5000:80
    networks:
      - galera_cc

Or, you can use the “docker service” command as per below:

$ docker service create --name cc_clustercontrol -p 5000:80 --replicas 1 severalnines/clustercontrol

Or, you can combine the ClusterControl service together with the database container service and form a “stack” in a compose file as shown in the next section.

Base Containers as Docker Swarm Service

The base container image, called "centos-ssh", is based on the CentOS 6 image. It comes with a couple of basic packages like the SSH server and client, curl and the mysql client. The entrypoint script downloads ClusterControl's public key for passwordless SSH during startup. It also registers itself in ClusterControl's CMON database for automatic deployment.

Running this container requires a couple of environment variables to be set:

  • CC_HOST - Mandatory. By default it will try to connect to the "cc_clustercontrol" service name. Otherwise, define its value as an IP address, hostname or service name. This container will automatically download the SSH public key from the ClusterControl node for passwordless SSH.
  • CLUSTER_TYPE - Mandatory. Default to “galera”.
  • CLUSTER_NAME - Mandatory. This name distinguishes the cluster with others from ClusterControl perspective. No space allowed and it must be unique.
  • VENDOR - Default is “percona”. Other supported values are “mariadb”, “codership”.
  • DB_ROOT_PASSWORD - Mandatory. The database root password for the database server. In this case, it should be MySQL root password.
  • PROVIDER_VERSION - Default is 5.6. The database version by the chosen vendor.
  • INITIAL_CLUSTER_SIZE - Default is 3. This indicates how ClusterControl should treat newly registered containers: whether they belong to a new deployment or to a scaling-out operation. For example, if the value is 3, ClusterControl will wait for 3 containers to be running and registered in the CMON database before starting the cluster deployment job; otherwise, it waits 30 seconds and retries on the next cycle. Subsequent containers (4th, 5th and Nth) will fall under the "Add Node" job instead.

To run the container, simply use the following stack definition in a compose file:

  galera:
    deploy:
      replicas: 3
    image: severalnines/centos-ssh
    ports:
      - 3306:3306
    environment:
      CLUSTER_TYPE: "galera"
      CLUSTER_NAME: "PXC_Docker"
      INITIAL_CLUSTER_SIZE: 3
      DB_ROOT_PASSWORD: "mypassword123"
    networks:
      - galera_cc

By combining them both (ClusterControl and database base containers), we can just deploy them under a single stack as per below:

version: '3'

services:

  galera:
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
        delay: 10s
    image: severalnines/centos-ssh
    ports:
      - 3306:3306
    environment:
      CLUSTER_TYPE: "galera"
      CLUSTER_NAME: "Galera_Docker"
      INITIAL_CLUSTER_SIZE: 3
      DB_ROOT_PASSWORD: "mypassword123"
    networks:
      - galera_cc

  clustercontrol:
    deploy:
      replicas: 1
    image: severalnines/clustercontrol
    ports:
      - 5000:80
    networks:
      - galera_cc

networks:
  galera_cc:
    driver: overlay

Save the above lines into a file, for example docker-compose.yml in the current directory. Then, start the deployment:

$ docker stack deploy --compose-file=docker-compose.yml cc
Creating network cc_galera_cc
Creating service cc_clustercontrol
Creating service cc_galera

Docker Swarm will deploy one container for ClusterControl (replicas: 1) and another 3 containers for the database cluster (replicas: 3). The database containers will then register themselves in the CMON database for deployment.

Wait for a Galera Cluster to be ready

The deployment will be picked up automatically by the ClusterControl CLI, so you basically don't have to do anything but wait. Deployment usually takes around 10 to 20 minutes, depending on the network connection.

Open the ClusterControl UI at http://{any_Docker_host}:5000/clustercontrol, fill in the default administrator user details and log in. Monitor the deployment progress under Activity -> Jobs, as shown in the following screenshot:

Or, you can look at the progress directly from the docker logs command of the ClusterControl container:

$ docker logs -f $(docker ps | grep clustercontrol | awk {'print $1'})
>> Found the following cluster(s) is yet to deploy:
Galera_Docker
>> Number of containers for Galera_Docker is lower than its initial size (3).
>> Nothing to do. Will check again on the next loop.
>> Found the following cluster(s) is yet to deploy:
Galera_Docker
>> Found a new set of containers awaiting for deployment. Sending deployment command to CMON.
>> Cluster name         : Galera_Docker
>> Cluster type         : galera
>> Vendor               : percona
>> Provider version     : 5.7
>> Nodes discovered     : 10.0.0.6 10.0.0.7 10.0.0.5
>> Initial cluster size : 3
>> Nodes to deploy      : 10.0.0.6;10.0.0.7;10.0.0.5
>> Deploying Galera_Docker.. It's gonna take some time..
>> You shall see a progress bar in a moment. You can also monitor
>> the progress under Activity (top menu) on ClusterControl UI.
Create Galera Cluster
- Job  1 RUNNING    [██▊       ]  26% Installing MySQL on 10.0.0.6

That’s it. Wait until the deployment completes and you will then be all set with a three-node Galera Cluster running on Docker Swarm, as shown in the following screenshot:

In ClusterControl, it has the same look and feel as Galera running in a standard host (non-container) environment.


Management

Managing database containers is a bit different with Docker Swarm. This section provides an overview of how the database containers should be managed through ClusterControl.

Connecting to the Cluster

To verify the status of the replicas and service name, run the following command:

$ docker service ls
ID            NAME               MODE        REPLICAS  IMAGE
eb1izph5stt5  cc_clustercontrol  replicated  1/1       severalnines/clustercontrol:latest
ref1gbgne6my  cc_galera          replicated  3/3       severalnines/centos-ssh:latest

If the application/client is running in the same Swarm network space, you can connect to the service directly via its service name endpoint. If not, use the routing mesh by connecting to the published port (3306) on any of the Docker Swarm nodes. Connections to these endpoints are load balanced automatically by Docker Swarm in round-robin fashion.
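For example (the IP address below is an assumption for one of the Swarm nodes), both of these reach the cluster - the first from a container on the same overlay network, the second via the routing mesh from outside:

$ mysql -h cc_galera -P 3306 -uroot -pmypassword123 -e 'SELECT @@hostname'
$ mysql -h 192.168.55.111 -P 3306 -uroot -pmypassword123 -e 'SELECT @@hostname'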

Scale up/down

Typically, when adding a new database node, we need to prepare a new host with the base operating system and passwordless SSH. In Docker Swarm, you just scale out the service to the number of replicas you desire:

$ docker service scale cc_galera=5
cc_galera scaled to 5

ClusterControl will then pick up the new containers registered inside cmon.containers table and trigger add node jobs, for one container at a time. You can look at the progress under Activity -> Jobs:

Scaling down is similar, using the same "service scale" command. However, ClusterControl doesn't know whether the containers removed by Docker Swarm were part of auto-rescheduling or a deliberate scale down (which indicates that we actually wanted the containers removed). Thus, to scale down from 5 nodes to 3 nodes, one would first run:

$ docker service scale cc_galera=3
cc_galera scaled to 3

Then, remove the stopped hosts from the ClusterControl UI by going to Nodes -> rollover the removed container -> click on the ‘X’ icon on the top right -> Confirm & Remove Node:

ClusterControl will then execute a remove node job and bring back the cluster to the expected size.

Failover

In case of container failure, Docker Swarm's automatic rescheduling kicks in and a replacement container is created with the same IP address as the old one (but a different container ID). ClusterControl then provisions this node from scratch - performing the installation and configuration, and getting it to rejoin the cluster. The old container is removed automatically from ClusterControl before the deployment starts.

Go ahead and try to kill one of the database containers:

$ docker kill [container ID]

You'll see that the new container Swarm creates is provisioned automatically by ClusterControl.

Creating a new cluster

To create a new cluster, just create another service or stack with a different CLUSTER_NAME and service name. The following example creates another Galera Cluster running MariaDB 10.1 (some extra environment variables are required for MariaDB 10.1):

version: '3'
services:
  galera2:
    deploy:
      replicas: 3
    image: severalnines/centos-ssh
    ports:
      - 3306
    environment:
      CLUSTER_TYPE: "galera"
      CLUSTER_NAME: "MariaDB_Galera"
      VENDOR: "mariadb"
      PROVIDER_VERSION: "10.1"
      INITIAL_CLUSTER_SIZE: 3
      DB_ROOT_PASSWORD: "mypassword123"
    networks:
      - cc_galera_cc

networks:
  cc_galera_cc:
    external: true

Then, create the service:

$ docker stack deploy --compose-file=docker-compose.yml db2

Go back to ClusterControl UI -> Activity -> Jobs and you should see that a new deployment has started. After a couple of minutes, the new cluster will be listed in the ClusterControl dashboard:

Destroying everything

To remove everything (including the ClusterControl container), you just need to remove the stack created by Docker Swarm:

$ docker stack rm cc
Removing service cc_clustercontrol
Removing service cc_galera
Removing network cc_galera_cc

That’s it, the whole stack has been removed. Pretty neat huh? You can start all over again by running the “docker stack deploy” command and everything will be ready after a couple of minutes.

Summary

The flexibility you get by running a single command to deploy or destroy a whole environment is useful for many types of use cases, such as backup verification, DDL procedure testing, query performance tweaking, proof-of-concept experiments, and staging temporary data. These use cases are closer to a development environment. With this approach, you can now treat a stateful service "statelessly".

Would you like to see ClusterControl manage the whole database container stack through the UI via point and click? Let us know your thoughts in the comments section below. In the next blog post, we are going to look at how to perform automatic backup verification on Galera Cluster using containers.


How Galera Cluster Enables High Availability for High Traffic Websites


In today's competitive technology environment, high availability is a must. There is no way around it - if your website or service is not available, you are most probably losing money. It can relate directly to money loss - your customers cannot access your e-commerce service and cannot spend their money (and they are likely to use your competitors instead). Or it can be less direct - your sales reps cannot reach your web-based CRM system, which seriously limits their productivity. Either way, a website which cannot be reached is a serious problem for any organization. The question is - how do we ensure that your website stays available? Assuming you are using MySQL or MariaDB (not unlikely, if it is a website), one of the technologies that can be utilized is Galera Cluster. In this blog post, we'll show you how to leverage a highly available database to improve the availability of your site.

High availability is hard with databases

Every website is different, but in general, we’ll see some frontend webservers, a database backend, load balancers, file system storage and additional components like caching systems. To make a website highly available, we would need each component to be highly available so we do not have any single point of failure (SPOF).

The webserver tier is usually relatively easy to scale, as it is possible to deploy multiple instances behind a load balancer. If you want to preserve session state across all webservers, you would probably store it in a shared datastore (Memcached, or even MySQL Cluster for a pretty robust alternative). Load balancers can be made highly available (e.g. HAProxy/Keepalived/VIP). There are a number of clustered file systems that can be used for the storage layer; we have previously covered solutions like csync2 with lsyncd, GlusterFS and OCFS2. The database service can be made highly available with e.g. DRBD, so all storage is replicated to a standby server. This means the service can be started on another host that has access to the database files.

This might not work very well for a high traffic website though, as all webservers will still be hitting the primary database instance. Failover can also take a while, since you are failing over to a cold standby server and MySQL has to perform crash recovery when starting up.

MySQL master-slave replication is another option, and there are different ways to make it highly available. There are inconveniences though, as not all nodes are the same - you need to make sure you write to only one master, and avoid diverging datasets across the nodes.

Galera Cluster for MySQL/MariaDB

Let’s start with discussing what Galera Cluster is and what it is not. It is a virtually synchronous, multi-master cluster. You can access any of the nodes and issue reads and writes - this is a significant improvement compared to replication setups: no need for failovers and master promotions, and if one node is down, you can usually just connect to another node and execute queries. Galera provides a self-healing mechanism through state transfers - State Snapshot Transfer (SST) and Incremental State Transfer (IST). When a node joins a cluster, Galera attempts to bring it back in sync. It may just copy the missing data from another node’s gcache (IST) or, if none of the nodes has the required data in its gcache, it will copy the entire contents of one of the nodes to the joining node (via SST). This also makes it very easy for a Galera Cluster to recover even from serious failure conditions. Galera Cluster is also a way to scale reads - you can increase the size of the cluster and read from all of the nodes.

On the other hand, you have to keep in mind what Galera is not. Galera is not a solution which implements sharding (like NDB Cluster does) - it does not shard the data automatically; each and every Galera node contains the same full data set, just like a standalone MySQL node. Therefore, Galera is not a solution which can help you scale writes. It can help you squeeze out more writes than standard, single-threaded replication, as it can utilize multiple writers at once, but as of MySQL 5.7, you can use multithreaded replication for every workload, so it’s not the same advantage it used to be. Galera is also not a solution which can be left alone - even though it has some auto-healing features, it still requires user supervision, and it happens pretty often that Galera cannot recover on its own.

Having said that, Galera Cluster is still a great piece of software, which can be utilized to build highly available clusters. In the next section we’ll show you different deployment patterns for Galera. Before we get there, there is one more very important bit of information required to understand why we want to deploy Galera clusters the way we are about to describe - quorum calculations. Galera has a mechanism which prevents split brain scenarios. It detects the number of available nodes and checks whether there is a quorum - Galera has to see (50% + 1) of the nodes to accept traffic. Otherwise, it assumes that a split brain has happened and that it is part of a minority segment which neither contains current data nor can accept writes. Now, let’s talk about the different ways in which you can deploy a Galera cluster.
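You can check a node’s view of the quorum at any time through the wsrep status counters - wsrep_cluster_size shows how many nodes are in the component this node belongs to, and wsrep_cluster_status shows whether that component has quorum (“Primary”) or not (“non-Primary”):

$ mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_%'"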

Deploying Galera cluster

Basic deployment - three node cluster

The most common way to deploy a Galera cluster is to use an odd number of nodes, and the most basic setup is a three-node cluster. This setup can survive the loss of one node - the cluster still operates just fine, even though its read capacity decreases and it cannot tolerate any further failures. Such a setup is a very common entry level - you can always add more nodes in the future, keeping in mind that you should use an odd number of nodes. We have seen Galera clusters as big as 11 - 13 nodes.
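For reference, the Galera-specific part of each node’s my.cnf in such a three-node setup boils down to a few lines. This is a minimal sketch - the IP addresses and cluster name are illustrative, and the provider path varies per distribution and vendor:

[mysqld]
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
wsrep_on                 = ON
wsrep_provider           = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name       = "my_galera_cluster"
wsrep_cluster_address    = "gcomm://192.168.1.101,192.168.1.102,192.168.1.103"
wsrep_sst_method         = rsync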

Minimalistic approach - two nodes + garbd

If you want to do some testing with Galera, you can reduce the cluster size to two nodes. A 2-node cluster does not provide any fault tolerance, but we can improve this by leveraging Galera Arbitrator (garbd) - a daemon which can be started on a third node. It receives all of the Galera replication traffic and, for the purpose of detecting failures and forming a quorum, it acts as a Galera node. Given that garbd doesn’t need to apply writesets (it just accepts them; no further action is taken), it doesn’t require beefy hardware like the database nodes do. This reduces the hardware cost of building the cluster.
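Starting the arbitrator on the third host is a one-liner. A sketch - the addresses and group name are illustrative and must match your cluster’s wsrep_cluster_address and wsrep_cluster_name:

$ garbd --address="gcomm://192.168.1.101:4567,192.168.1.102:4567" \
        --group="my_galera_cluster" \
        --daemon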

One step further - add asynchronous slave

At any point you can extend your Galera cluster by adding one or more asynchronous slaves. Ideally, your Galera cluster has GTID enabled - it makes adding and reslaving slaves so much easier. Even without it, it is still possible to create a slave, although replication is much less likely to survive a crash of the “master” Galera node. Such an asynchronous slave can be used for a variety of purposes. It can serve as a backup host - run your backups on it to minimize the impact on the Galera cluster. It can also be used for heavier OLAP queries - again, to remove load from the Galera cluster. Another reason to use an asynchronous slave is to build a Disaster Recovery environment - set it up in a separate datacenter and, should the location of your Galera Cluster go up in flames, you will still have a copy of your data in a safe location.
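For the Galera nodes to act as masters for such a slave, they need binary logging enabled, and ideally GTID. On a MySQL or Percona-based cluster this is roughly the following my.cnf fragment (a sketch - MariaDB implements GTID differently and does not use the last two variables):

[mysqld]
server_id                = 101      # must be unique per node
log_bin                  = binlog
log_slave_updates        = ON       # write Galera-applied transactions to the binlog
gtid_mode                = ON
enforce_gtid_consistency = ON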

Multi-datacenter Galera clusters

If you really care about the availability of your data, you can improve it by spanning your Galera cluster across multiple datacenters. Galera can be used across the WAN - some reconfiguration may be needed to adapt it to the higher latency of WAN connections, but it is totally suitable for such an environment. The only blocker would be if your application frequently modifies a very small subset of rows - this could significantly reduce the number of queries per second the cluster is able to execute.

When talking about WAN-spanning Galera clusters, it is important to mention segments. Segments are used in Galera to mark the nodes of the cluster which are collocated in the same DC. For example, you may want to configure all nodes in datacenter “A” to use segment 1 and all nodes in datacenter “B” to use segment 2. We won’t go into details here (we covered this bit extensively in one of our posts) but in short, using segments reduces inter-segment communication - something you definitely want when segments are connected over a WAN.
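Both the segment assignment and the usual WAN latency tuning live in wsrep_provider_options. A sketch of what this could look like on a node in datacenter “B” - the timeout values are illustrative starting points, not recommendations:

wsrep_provider_options = "gmcast.segment=2; evs.suspect_timeout=PT10S; evs.inactive_timeout=PT30S; evs.install_timeout=PT30S"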

Another important aspect of building a highly available Galera Cluster across multiple datacenters is to use an odd number of datacenters. Two are not enough to build a setup which automatically tolerates the loss of a datacenter. Let’s analyze the following setup.

As we mentioned earlier, Galera requires a quorum to operate. Let’s see what would happen if one datacenter is not available:

As you can clearly see, we ended up with only 3 of the 6 nodes up - exactly 50%, which is not enough to form a quorum. Manual action is required to assess the situation and promote the remaining part of the cluster to a “Primary Component” - this takes time and prolongs the downtime.

Let’s take a look at another option:

In this case, we added one more node to datacenter “B” to make sure it can take over the traffic when DC “A” is down. But what would happen if DC “B” goes down instead?

We have three nodes out of seven - less than 50%. The only way out is to use a third datacenter. You can, of course, use one more segment of three nodes (or more - it’s important to have the same number of nodes in each datacenter), but you can minimize costs by utilizing Galera Arbitrator:

In this case, no matter which datacenter stops operating, as long as it is only one of them, garbd ensures we have a quorum:

In our example it is: seven nodes in total, three down, four (3 + garbd) up - enough to form a quorum.

Multiple Galera clusters connected using asynchronous replication

We mentioned that you can deploy an asynchronous slave off a Galera cluster and use it, for example, as a DR host. You also have to keep in mind that, to some extent, a Galera cluster can be treated as a single MySQL instance. This means there’s no reason why you couldn’t connect two separate Galera clusters using asynchronous replication; such a setup is pretty common. Again, as with regular slaves, it’s better to have GTID, because it allows you to quickly reslave your “standby” Galera cluster off another node in the “active” cluster. Without GTID this is also possible, but it is much more time-consuming and error-prone. Of course, such a setup does not behave the way the multi-DC Galera cluster would - there’s no automated recovery of the replication link between datacenters, and no automated reslaving if a “master” Galera node goes down. You may need to build your own tools to automate this process.

Proxy layer

There is no high availability without a proxy layer (unless you have built HA into your application). A static connection to a single host does not scale. What you need is a middleman - something that sits between your application and your database tier and masks the complexity of your database setup. Ideally, your application has just a single point of access to your databases - connect to a given host and port, and that’s it.

You can choose between different proxies - HAProxy, MaxScale, ProxySQL - and all of them can work with Galera Cluster. Some of them, though, may require additional configuration - you need to know what needs to be done and how. Additionally, it’s extremely important not to end up with the proxy itself as a single point of failure. Each of these proxies can be deployed in a highly available fashion, though you will need a few additional components working together.
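As an illustration, a minimal HAProxy backend for a three-node Galera cluster could look like the following sketch - the IP addresses are illustrative, and a production setup would typically add a proper external health check (e.g. a script that verifies the Galera state) instead of a plain TCP check:

listen galera_cluster
    bind *:3307
    mode tcp
    balance leastconn
    option tcpka
    server galera1 10.0.0.3:3306 check
    server galera2 10.0.0.4:3306 check
    server galera3 10.0.0.5:3306 check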

How ClusterControl can help you to build highly available Galera Clusters?

ClusterControl can help you deploy Galera clusters from all available vendors: Codership, Percona (XtraDB Cluster) and MariaDB (MariaDB Cluster).

Once you have deployed a cluster, you can easily scale it up. If needed, you can define different segments for a WAN-spanning cluster.

You can also, if you want, deploy an asynchronous slave to the Galera Cluster.

You can use ClusterControl to deploy Galera Arbitrator, which can be very helpful (as we showed previously) in multi-datacenter deployments.

Proxy layer

ClusterControl gives you the ability to deploy different proxies with your Galera Cluster. It supports deployments of HAProxy, MaxScale and ProxySQL. For HAProxy and ProxySQL, there are additional options to deploy redundant instances with Keepalived and Virtual IP.

For HAProxy, ClusterControl has a statistics page. You can also set a node to maintenance state:

For MaxScale, using ClusterControl, you have access to MaxScale’s CLI and can perform any actions that are possible from it. Please note that we deploy MaxScale version 1.4.3 - more recent versions introduced licensing limitations.

ClusterControl provides quite a bit of functionality for managing ProxySQL, including creating query rules and query caching. If you are interested in more details, we encourage you to watch the replay of one of our webinars in which we demoed this UI.

As mentioned previously, ClusterControl can also deploy HAProxy and ProxySQL in a highly available setup. We use a virtual IP and Keepalived to track the state of the services and perform a failover if needed.

As we have seen in this blog, Galera Cluster can be used in a number of ways to provide high availability of your database, which is a key part of the web infrastructure. You are welcome to download ClusterControl and try the different topologies we discussed above.

ClusterControl for Galera Cluster for MySQL


ClusterControl allows you to easily manage your database infrastructure on premise or in the cloud. With in-depth support for technologies like Galera Cluster for MySQL and MariaDB setups, you can truly automate mixed environments for next-level applications.

Since the launch of ClusterControl in 2012, we’ve experienced growth in new industries with customers who are benefiting from the advancements ClusterControl has to offer - in particular when it comes to Galera Cluster for MySQL.

In addition to reaching new highs in ClusterControl demand, this past year we’ve doubled the size of our team allowing us to continue to provide even more improvements to ClusterControl.

Take a look at this infographic for our top Galera Cluster for MySQL resources and information about how ClusterControl works with Galera Cluster.

How to use s9s -The Command Line Interface to ClusterControl


s9s is our official command line tool to interact with ClusterControl. We’re pretty excited about this, as we believe its ease of use and script-ability will make our users even more productive. Let’s have a look at how to use it. In this blog, we’ll show you how to use s9s to deploy and manage your database clusters.

ClusterControl v1.4.1 comes with an optional package called s9s-tools, which contains a binary called "s9s". As most of you already know, ClusterControl provides a graphical user interface from where you can deploy, monitor and manage your databases. The GUI interacts with the ClusterControl Controller (cmon) via an RPC interface. The new CLI client is another way to interact with the RPC interface, by using a collection of command line options and arguments. At the time of writing, the CLI has support for a big chunk of the ClusterControl functionality and the plan is to continue to build it out. Please refer to ClusterControl CLI documentation page for more details. It is worth mentioning that the CLI is open source, so it is possible for anybody to add functionality to it.

As a side note, s9s is the backbone that drives the automatic deployment when running ClusterControl and Galera Cluster on Docker Swarm, as shown in this blog post. Take note that this tool is relatively new and comes with some limitations, such as a separate user management module and no support for Role-Based Access Control.

Setting up the Client

Starting from version 1.4.1, the installer script will automatically install this package on the ClusterControl node. You can also install it on another computer or workstation to manage the database cluster remotely. All communication is encrypted and secure through SSH.

In this example, the client will be installed on another workstation running on Ubuntu. We are going to connect to the ClusterControl server remotely. Here is the diagram to illustrate this:

We have covered the installation and configuration steps in the documentation. Ensure you perform the following steps:

  1. On the ClusterControl host, ensure it runs on ClusterControl Controller 1.4.1 and later.
  2. On the ClusterControl host, ensure CMON RPC interface (port 9500 and 9501) is listening to an IP address that is routable to external network. Follow these steps.
  3. Install s9s-tools package on the workstation. Follow these installation steps.
  4. Configure the Remote Access. Follow these steps (a minimal configuration example is sketched below).
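For reference, the remote access configuration on the workstation ends up in ~/.s9s/s9s.conf and looks roughly like this (a sketch - the controller address is illustrative; follow the linked steps for the authoritative procedure):

[global]
cmon_user  = dba
controller = https://10.0.0.7:9501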

Take note that it is also possible to build the s9s command line client on other Linux distributions and Mac OS X, as described here. The command line client installs manual pages, which can be viewed by entering the command:

$ man s9s

Deploy Everything through CLI

In this example, we are going to perform the following operations with the CLI:

  1. Deploy a three-node Galera Cluster
  2. Monitor state and process
  3. Create schema and user
  4. Take backups
  5. Cluster and node operations
  6. Scaling up/down

Deploy a three-node Galera Cluster

First, on the ClusterControl host, ensure we have setup passwordless SSH to all the target hosts:

(root@clustercontrol)$ ssh-copy-id 10.0.0.3
(root@clustercontrol)$ ssh-copy-id 10.0.0.4
(root@clustercontrol)$ ssh-copy-id 10.0.0.5

Then, from the client workstation:

(client)$ s9s cluster --create --cluster-type=galera --nodes="10.0.0.3;10.0.0.4;10.0.0.5"  --vendor=percona --provider-version=5.7 --db-admin-passwd="mySecr3t" --os-user=root --cluster-name="PXC_Cluster_57" --wait

We defined “--wait”, which means the job will run in the foreground and wait for the job to complete. It returns 0 for a successful job, or non-zero if the job fails. To let the job run in the background, just omit this flag.

Then, you should see the progress bar:

Create Galera Cluster
\ Job  1 RUNNING    [▊       ]  26% Installing MySQL on 10.0.0.3

The same progress can be monitored under Activity (top menu) of the ClusterControl UI:

Notice that the job was initiated by user 'dba', which is our command line remote user.

Monitor state and process

There are several ways to inspect the cluster. You can simply list the clusters with --list:

$ s9s cluster --list --long
ID STATE   TYPE   OWNER GROUP NAME           COMMENT
 1 STARTED galera dba   users PXC_Cluster_57 All nodes are operational.
Total: 1

Also, there is another flag called --stat for a more detailed summary:

$ s9s cluster --stat
    Name: PXC_Cluster_57                      Owner: dba/users
      ID: 1                                   State: STARTED
    Type: GALERA                             Vendor: percona 5.7
  Status: All nodes are operational.
  Alarms:  0 crit   1 warn
    Jobs:  0 abort  0 defnd  0 dequd  0 faild  2 finsd  0 runng
  Config: '/etc/cmon.d/cmon_1.cnf'

Take note that you can use cluster ID or name value as the identifier when manipulating our Galera Cluster. More examples further down.

To get an overview of all nodes, we can simply use the “node” command option:

$ s9s node --list --long
ST  VERSION         CID CLUSTER        HOST      PORT COMMENT
go- 5.7.17-13-57      1 PXC_Cluster_57 10.0.0.3  3306 Up and running
go- 5.7.17-13-57      1 PXC_Cluster_57 10.0.0.4  3306 Up and running
go- 5.7.17-13-57      1 PXC_Cluster_57 10.0.0.5  3306 Up and running
co- 1.4.1.1834        1 PXC_Cluster_57 10.0.0.7  9500 Up and running
go- 10.1.23-MariaDB   2 MariaDB_10.1   10.0.0.10 3306 Up and running
go- 10.1.23-MariaDB   2 MariaDB_10.1   10.0.0.11 3306 Up and running
co- 1.4.1.1834        2 MariaDB_10.1   10.0.0.7  9500 Up and running
gr- 10.1.23-MariaDB   2 MariaDB_10.1   10.0.0.9  3306 Failed
Total: 8

s9s allows you to have an aggregated view of all processes running on all nodes. It can be represented in a real-time format (similar to ‘top’ output) or one-time format (similar to ‘ps’ output). To monitor live processes, you can do:

$ s9s process --top --cluster-id=1
PXC_Cluster_57 - 04:27:12                                           All nodes are operational.
4 hosts, 16 cores,  6.3 us,  0.7 sy, 93.0 id,  0.0 wa,  0.0 st,
GiB Mem : 14.8 total, 2.9 free, 4.6 used, 0.0 buffers, 7.2 cached
GiB Swap: 7 total, 0 used, 7 free,

PID   USER   HOST     PR  VIRT      RES    S   %CPU   %MEM COMMAND
 4623 dba    10.0.0.5 20  5700352   434852 S  20.27  11.24 mysqld
 4772 dba    10.0.0.4 20  5634564   430864 S  19.99  11.14 mysqld
 2061 dba    10.0.0.3 20  5780688   459160 S  19.91  11.87 mysqld
  602 root   10.0.0.7 20  2331624    38636 S   8.26   1.00 cmon
  509 mysql  10.0.0.7 20  2613836   163124 S   0.66   4.22 mysqld
...

Similar to the top command, you get a real-time summary for all nodes in this cluster. The first line shows the cluster name, the current time and the cluster state. The second, third and fourth lines show the accumulated resources of all nodes in the cluster combined. In this example, we used 4 hosts (1 ClusterControl + 3 Galera), each with 4 cores, ~4 GB of RAM and around 2 GB of swap.

To list out processes (similar to ps output) of all nodes for cluster ID 1, you can do:

$ s9s process --list --cluster-id=1
PID   USER   HOST     PR  VIRT      RES    S   %CPU   %MEM COMMAND
 2061 dba    10.0.0.3 20  5780688   459160 S  25.03  11.87 mysqld
 4623 dba    10.0.0.5 20  5700352   434852 S  23.87  11.24 mysqld
 4772 dba    10.0.0.4 20  5634564   430864 S  20.86  11.14 mysqld
  602 root   10.0.0.7 20  2331624    42436 S   8.73   1.10 cmon
  509 mysql  10.0.0.7 20  2613836   173688 S   0.66   4.49 mysqld
...

You can see there is an extra column called “HOST”, which shows the host where the process is running. This centralized view will surely save you time, as you do not have to collect individual outputs from each node and then compare them.

Create database schema and user

Now we know our cluster is ready and healthy. We can then create a database schema:

$ s9s cluster --create-database --cluster-name='PXC_Cluster_57' --db-name=db1
Database 'db1' created.

The command below does the same thing but using the cluster ID as the cluster identifier instead:

$ s9s cluster --create-database --cluster-id=1 --db-name=db2
Database 'db2' created.

Then, create a database user associated with this database together with proper privileges:

$ s9s cluster --create-account --cluster-name='PXC_Cluster_57' --account='userdb1:password@10.0.0.%' --privileges='db1.*:SELECT,INSERT,UPDATE,DELETE,INDEX,CREATE'
Account 'userdb1' created.
Grant processed: GRANT SELECT, INSERT, UPDATE, DELETE, INDEX, CREATE ON db1.* TO 'userdb1'@'10.0.0.%'

You can now import or start to work with the database.

Take backups with mysqldump and Xtrabackup

Creating a backup is simple. You just need to decide which node to back up and which backup method to use. The backup is stored on the controller node by default, unless you specify the --on-node flag. If the backup destination directory does not exist, ClusterControl will create it for you.

Backup completion time varies depending on the database size. It’s good to let the backup job run in the background:

$ s9s backup --create --backup-method=mysqldump --cluster-id=1 --nodes=10.0.0.5:3306 --backup-directory=/storage/backups
Job with ID 4 registered.

The ID for the backup job is 4. We can list all jobs by simply listing them out:

$ s9s job --list
ID CID STATE    OWNER  GROUP  CREATED  RDY  COMMENT
 1   0 FINISHED dba    users  06:19:33 100% Create Galera Cluster
 2   1 FINISHED system admins 06:33:48 100% Galera Node Recovery
 3   1 FINISHED system admins 06:36:04 100% Galera Node Recovery
 4   1 RUNNING3 dba    users  07:21:30   0% Create Backup
Total: 4

The job list tells us that there is a running job in state RUNNING3 (the job is running on thread #3). You can then attach to this job if you would like to monitor its progress:

$ s9s job --wait --job-id=4
Create Backup
\ Job  4 RUNNING3   [    █     ] ---% Job is running

Or, inspect the job messages using the --log flag:

$ s9s job --log --job-id=4
10.0.0.5:3306: Preparing for backup - host state (MYSQL_OK) is acceptable.
10.0.0.7: creating backup dir: /storage/backups/BACKUP-1
10.0.0.5:3306: detected version 5.7.17-13-57.
Extra-arguments be passed to mysqldump:  --set-gtid-purged=OFF
10.0.0.7: Starting nc -dl 9999 > /storage/backups/BACKUP-1/mysqldump_2017-05-09_072135_mysqldb.sql.gz 2>/tmp/netcat.log.
10.0.0.7: nc started, error log: 10.0.0.7:/tmp/netcat.log.
Backup (mysqldump, storage controller): '10.0.0.5: /usr/bin/mysqldump --defaults-file=/etc/my.cnf --flush-privileges --hex-blob --opt   --set-gtid-purged=OFF  --single-transaction --skip-comments --skip-lock-tables --skip-add-locks --databases mysql |gzip  - | nc 10.0.0.7 9999'.
10.0.0.5: MySQL >= 5.7.6 detected, enabling 'show_compatibility_56'
A progress message will be written every 1 minutes
...

The same applies to xtrabackup. Just change the backup method accordingly. The supported values are xtrabackupfull (full backup) and xtrabackupincr (incremental backup):

$ s9s backup --create --backup-method=xtrabackupfull --cluster-id=1 --nodes=10.0.0.5:3306 --backup-directory=/storage/backups
Job with ID 6 registered.

Take note that an incremental backup requires an existing full backup of the same databases (all, or the ones individually specified); otherwise the incremental backup will be upgraded to a full backup.
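An incremental run looks the same; only the backup method changes (the node and directory below simply reuse the values from the earlier examples):

$ s9s backup --create --backup-method=xtrabackupincr --cluster-id=1 --nodes=10.0.0.5:3306 --backup-directory=/storage/backups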

You can then list out the backups created for this cluster:

$ s9s backup --list --cluster-id=1 --long --human-readable
ID CID STATE     OWNER HOSTNAME CREATED  SIZE FILENAME
 1   1 COMPLETED dba   10.0.0.5 07:21:39 252K mysqldump_2017-05-09_072135_mysqldb.sql.gz
 1   1 COMPLETED dba   10.0.0.5 07:21:43 1014 mysqldump_2017-05-09_072135_schema.sql.gz
 1   1 COMPLETED dba   10.0.0.5 07:22:03 109M mysqldump_2017-05-09_072135_data.sql.gz
 1   1 COMPLETED dba   10.0.0.5 07:22:07  679 mysqldump_2017-05-09_072135_triggerseventsroutines.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:20 252K mysqldump_2017-05-09_073016_mysqldb.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:24 1014 mysqldump_2017-05-09_073016_schema.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:44 109M mysqldump_2017-05-09_073016_data.sql.gz
 2   1 COMPLETED dba   10.0.0.5 07:30:49  679 mysqldump_2017-05-09_073016_triggerseventsroutines.sql.gz

Omit the “--cluster-id=1” option to see the backup records for all your clusters.

Cluster and node operations

Performing a rolling restart (one node at a time) can be done with a single command line:

$ s9s cluster --rolling-restart --cluster-id=1 --wait
Rolling Restart
| Job  9 RUNNING    [███       ]  31% Stopping 10.0.0.4

For configuration management, we can get a list of configuration options defined inside a node’s my.cnf, and pipe the stdout to grep for filtering:

$ s9s node --list-config --nodes=10.0.0.3 | grep max_
MYSQLD      max_heap_table_size                    64M
MYSQLD      max_allowed_packet                     512M
MYSQLD      max_connections                        500
MYSQLD      wsrep_max_ws_rows                      131072
MYSQLD      wsrep_max_ws_size                      1073741824
mysqldump   max_allowed_packet                     512M

Let’s say we would like to reduce the max_connections. Then, we can use the “node” command option to perform the configuration update as shown in the following example:

$ s9s node --change-config --nodes=10.0.0.3 --opt-group=mysqld --opt-name=max_connections --opt-value=200
Variable 'max_connections' set to '200' and effective immediately.
Persisted change to configuration file /etc/my.cnf.
$ s9s node --change-config --nodes=10.0.0.4 --opt-group=mysqld --opt-name=max_connections --opt-value=200
Variable 'max_connections' set to '200' and effective immediately.
Persisted change to configuration file /etc/my.cnf.
$ s9s node --change-config --nodes=10.0.0.5 --opt-group=mysqld --opt-name=max_connections --opt-value=200
Variable 'max_connections' set to '200' and effective immediately.
Persisted change to configuration file /etc/my.cnf.

As stated in the job response, the changes are effective immediately, so a node restart is not necessary.

Scaling up and down

Adding a new database node is simple. First, setup a passwordless SSH to the new node:

(clustercontrol)$ ssh-copy-id 10.0.0.9

Then, specify the node’s IP address or hostname together with the cluster identifier (assume we want to add the node into a cluster with ID=2):

(client)$ s9s cluster --add-node --nodes=10.0.0.9 --cluster-id=2 --wait
Add Node to Cluster
| Job  9 FINISHED   [██████████] 100% Job finished.

To remove a node, one would do:

$ s9s cluster --remove-node --nodes=10.0.0.9 --cluster-id=2
Job with ID 10 registered.

Summary

ClusterControl CLI can be nicely integrated with your infrastructure automation tools. It opens a new way to interact and manipulate your databases. It is completely open source and available on GitHub. Go check it out!

MySQL on Docker: Swarm Mode Limitations for Galera Cluster in Production Setups


In the last couple of blog posts on Docker, we have looked into understanding and running Galera Cluster on Docker Swarm. It scales and fails over pretty well, but there are still some limitations that prevent it from running smoothly in a production environment. We will discuss these limitations, and see how we can overcome them. Hopefully, this will clear up some of the questions that might be circling around in your head.

Docker Swarm Mode Limitations

Docker Swarm Mode is tremendous at orchestrating and handling stateless applications. However, since our focus is on trying to make Galera Cluster (a stateful service) run smoothly on Docker Swarm, we have to make some adaptations to bring the two together. Running Galera Cluster in containers in production requires at least:

  • Health check - Each of the stateful containers must pass the Docker health checks, to ensure it achieves the correct state before being included into the active load balancing set.
  • Data persistency - Whenever a container is replaced, it has to be started from the last known good configuration. Else you might lose data.
  • Load balancing algorithm - Since Galera Cluster can handle read/write simultaneously, each node can be treated equally. A recommended balancing algorithm for Galera Cluster is least connection. This algorithm takes into consideration the number of current connections each server has. When a client attempts to connect, the load balancer will try to determine which server has the least number of connections and then assign the new connection to that server.

We are going to discuss all the points mentioned above in great detail, plus possible workarounds for these problems.

Health Check

HEALTHCHECK is a command that tells Docker how to test a container to check that it is still working. In Galera, the fact that mysqld is running does not mean a node is healthy and ready to serve. Without a proper health check, Galera could be wrongly diagnosed when something goes wrong, and by default, Docker Swarm’s ingress network will include the “STARTED” container in the load balancing set regardless of the Galera state. Without it, you would have to manually attach to a MySQL container and check various MySQL statuses to determine whether the container is healthy.
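To illustrate the idea, here is what a Galera-aware health check could look like in a Dockerfile. This is a minimal sketch, not the actual script shipped in the severalnines images - it simply tests whether the node reports the Synced state (DB_ROOT_PASSWORD matches the variable used in the compose examples earlier):

HEALTHCHECK --interval=30s --timeout=10s --retries=20 \
  CMD mysql -uroot -p"$DB_ROOT_PASSWORD" -N -B \
      -e "SHOW STATUS LIKE 'wsrep_local_state_comment'" | grep -q Synced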

With HEALTHCHECK configured, container healthiness can be retrieved directly from the standard “docker ps” command:

$ docker ps
CONTAINER ID        IMAGE                       COMMAND             CREATED             STATUS                    PORTS
42f98c8e0934        severalnines/mariadb:10.1   "/entrypoint.sh "   13 minutes ago      Up 13 minutes (healthy)   3306/tcp, 4567-4568/tcp

Plus, Docker Swarm’s ingress network will only include the healthy container once the health check output starts to return 0 after startup. The following comparison shows these two behaviours:

Without HEALTHCHECK

Sample output:

Hostname: db_mariadb_galera.2
Hostname: db_mariadb_galera.3
Hostname: ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.1.100' (111)
Hostname: db_mariadb_galera.1
Hostname: db_mariadb_galera.2
Hostname: db_mariadb_galera.3
Hostname: ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.1.100' (111)
Hostname: db_mariadb_galera.1
Hostname: db_mariadb_galera.2
Hostname: db_mariadb_galera.3
Hostname: db_mariadb_galera.4
Hostname: db_mariadb_galera.1

Description: Applications will see errors, because container db_mariadb_galera.4 is introduced into the load balancing set incorrectly. Without HEALTHCHECK, the STARTED container is part of the “active” tasks in the service.

With HEALTHCHECK

Sample output:

Hostname: db_mariadb_galera.1
Hostname: db_mariadb_galera.2
Hostname: db_mariadb_galera.3
Hostname: db_mariadb_galera.1
Hostname: db_mariadb_galera.2
Hostname: db_mariadb_galera.3
Hostname: db_mariadb_galera.4
Hostname: db_mariadb_galera.1
Hostname: db_mariadb_galera.2

Description: Container db_mariadb_galera.4 is introduced into the load balancing set correctly. With a proper HEALTHCHECK, a new container becomes part of the “active” tasks in the service only once it is marked as healthy.

The only problem with the Docker health check is that it only supports two exit codes - 1 (unhealthy) or 0 (healthy). This is enough for a stateless application, where containers can come and go without caring much about state, or about other containers. With a stateful service like Galera Cluster or MySQL Replication, another exit code is needed to represent a staging phase. For example, when a joiner node comes into the picture, it needs to sync from a donor node (via SST or IST). This process is started automatically by Galera and may take minutes or hours to complete, and the current workaround is to configure [--update-delay] and [--health-interval * --health-retries] to be higher than the SST/IST time.

For a clearer perspective, consider the following “service create” command example:

$ docker service create \
--replicas=3 \
--health-interval=30s \
--health-retries=20 \
--update-delay=600s \
--name=galera \
--network=galera_net \
severalnines/mariadb:10.1

The container will be destroyed if the SST process takes more than 600 seconds. While in this state, the health check script will return “exit 1 (unhealthy)” on both joiner and donor containers, because neither is supposed to be included in Docker Swarm’s load balancer while they are syncing. After 20 consecutive failures at 30-second intervals (equal to 600 seconds), the joiner and donor containers will be removed by Docker Swarm and replaced by new containers.

It would be perfect if Docker’s HEALTHCHECK could accept more than exit "0" or "1" to signal Swarm’s load balancer. For example:

  • exit 0 => healthy => load balanced and running
  • exit 1 => unhealthy => no balancing and failed
  • exit 2 => unhealthy but ignore => no balancing but running

Thus, we wouldn’t have to determine the SST time for containers to survive the Galera Cluster startup operation, because:

  • Joiner/Joined/Donor/Desynced == exit 2
  • Synced == exit 0
  • Others == exit 1

Another workaround, apart from setting [--update-delay] and [--health-interval * --health-retries] higher than the SST time, is to use HAProxy as the load balancer endpoint instead of relying on Docker Swarm’s load balancer. More discussion on this further down.

Data Persistency

Stateless services don’t really care about persistency. A container shows up, serves and gets destroyed when the job is done or when it becomes unhealthy. The problem with this behaviour is that there is a chance of total data loss in Galera Cluster, something a database service cannot afford. Take a look at the following example:

$ docker service create \
--replicas=3 \
--health-interval=30s \
--health-retries=20 \
--update-delay=600s \
--name=galera \
--network=galera_net \
severalnines/mariadb:10.1

So, what happens if the switch connecting the three Docker Swarm nodes goes down? A network partition, which splits the three-node Galera Cluster into ‘single-node’ components. The cluster state gets demoted to Non-Primary and the Galera node state turns to Initialized. This makes the containers unhealthy according to the health check. If the network is still down after 600 seconds, those database containers will be destroyed and replaced with new containers by Docker Swarm, according to the “docker service create” command. You will end up with a new cluster starting from scratch, and the existing data will be lost.

There is a workaround to protect against this: using global mode with placement constraints. This is the preferred way when running your database containers on Docker Swarm with persistent storage in mind. Consider the following example:

$ docker service create \
--mode=global \
--constraints='node.labels.type == galera' \
--health-interval=30s \
--health-retries=20 \
--update-delay=600s \
--name=galera \
--network=galera_net \
severalnines/mariadb:10.1

The cluster size is limited to the number of available Docker Swarm nodes labelled with “type=galera”. Dynamic scaling is not an option here - scaling up or down is only possible if you introduce or remove a Swarm node carrying the correct label (attaching the label is shown below). The following diagram shows a 3-node Galera Cluster with persistent volumes, constrained by the custom node label “type=galera”:
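For reference, attaching the custom label to a Swarm node is done with a standard Docker command (the node name is illustrative):

$ docker node update --label-add type=galera swarm-worker-1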

It would also be great if Docker Swarm supported more options to handle container failures:

  • Don’t delete the last X failed containers, for troubleshooting purposes.
  • Don’t delete the last X volumes, for recovery purposes.
  • Notify users if a container is recreated, deleted, rescheduled.

Load Balancing Algorithm

Docker Swarm comes with a load balancer, based on the IPVS module in the Linux kernel, that distributes traffic to all containers in round-robin fashion. It lacks several useful configurable options for routing stateful applications, for example persistent connections (so a source always reaches the same destination) and support for other balancing algorithms, like least connection, weighted round-robin or random. Despite IPVS being capable of handling persistent connections via the “-p” option, this doesn’t seem to be configurable in Docker Swarm.

In MySQL, some connections take longer to process before the output is returned to the client. Thus, load distribution for Galera Cluster should use the “least connection” algorithm, so the load is distributed equally across all database containers. The load balancer would ideally monitor the number of open connections for each server, and send new connections to the least busy one. Kubernetes defaults to least connection when distributing traffic to the backend containers.

As a workaround, relying on other load balancers in front of the service is still the recommended way. HAProxy, ProxySQL and MaxScale excel in this area. However, you have to make sure these load balancers are aware of dynamic changes to the backend database containers, especially during scaling and failover.

Summary

Galera Cluster on Docker Swarm fits well in development, test and staging environments, but it needs some more work when running in production. The technology still needs some time to mature, but as we saw in this blog, there are ways to work around the current limitations.

Video: Interview with Krzysztof Książek on the Upcoming Webinar: MySQL Tutorial - Backup Tips for MySQL


We sat down with Severalnines Senior Support Engineer Krzysztof Książek to discuss the upcoming webinar MySQL Tutorial - Backup Tips for MySQL, MariaDB & Galera Cluster.


ClusterControl for MySQL Backup

ClusterControl provides you with sophisticated backup and failover features with a point-and-click interface to easily restore your data if something goes wrong. These advanced automated failover and backup technologies ensure your mission critical applications achieve high availability with zero downtime.

ClusterControl allows you to...

  • Create Backups
  • Schedule Backups
  • Set backup configuration method
    • Enable compression
    • Use mysqldump, xtrabackup, or NDB backup
    • Use PIGZ for parallel gzip
  • Backup to multiple locations (including the cloud)
  • Enable automated failover
  • View logs

ClusterControl also offers backup support for MongoDB & PostgreSQL

Learn more about the backup features in ClusterControl for MySQL here.
