Illegal mix of collations: IntelliJ and UTF8mb4

When using variables inside SQL scripts within IntelliJ products (e.g. DataGrip), certain queries will not work because the encodings of the IntelliJ client and the server mismatch. This occurs, for instance, when you compare variables. A typical error message looks like this:

[HY000][1267] Illegal mix of collations (utf8mb4_unicode_520_ci,IMPLICIT) \
   and (utf8mb4_general_ci,IMPLICIT) for operation 'like'

IntelliJ products do not yet support MySQL's utf8mb4 character set. The problem occurs when using variables in queries. By default, IntelliJ uses a UTF-8 encoding for the connection. When you use utf8mb4 as the database default character set, variables will be encoded in UTF-8 while the database content remains in utf8mb4. It is also not possible to specify the character set in the IntelliJ connection settings, as the IDE will then refuse to connect.
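
The following minimal sketch reproduces the problem; the table demo and its column are made up for this illustration:

-- column collation differs from the connection collation
CREATE TABLE demo (name VARCHAR(100))
    CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

SET @pattern := 'Jo%';  -- the variable uses the connection's collation
SELECT name FROM demo WHERE name LIKE @pattern;  -- fails with error 1267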

Check your server settings using the MySQL client:

MySQL [cropster_research]> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

This looks correct, but when you connect with the IntelliJ client, you will get wrong results when you use variables. Until the products support utf8mb4, you need to add the following statements to your script in order to force the correct encoding:

SET character_set_connection=utf8mb4;
SET collation_connection=utf8mb4_unicode_520_ci;
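
With these statements at the top of the script, the comparison from the hypothetical example above works:

SET character_set_connection=utf8mb4;
SET collation_connection=utf8mb4_unicode_520_ci;

SET @pattern := 'Jo%';
SELECT name FROM demo WHERE name LIKE @pattern;  -- no collation mix anymore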

Continue reading


Verifying Replication Consistency with Percona’s pt-table-checksum

Replication is an important concept for improving database performance and security. In this blog post, I would like to demonstrate how the consistency between a MySQL master and a slave can be verified. We will create two Docker containers, one for the master and one for the slave.

Installing the Percona Toolkit

The Percona Toolkit is a collection of useful utilities, which can be obtained for free from the company's portal. The following commands install the prerequisites, download the package and finally install it.

sudo apt-get install -y wget libdbi-perl libdbd-mysql-perl libterm-readkey-perl libio-socket-ssl-perl
wget https://www.percona.com/downloads/percona-toolkit/3.0.4/binary/debian/xenial/x86_64/\
    percona-toolkit_3.0.4-1.xenial_amd64.deb
sudo dpkg -i percona-toolkit_3.0.4-1.xenial_amd64.deb 

Setting up a Test Environment with Docker

The following command creates and starts a Docker container for the master. Note that these are minimal examples which are not suitable for a production environment.

docker run --name mysql_master -e MYSQL_ALLOW_EMPTY_PASSWORD=true -d mysql:5.6 --log-bin \
   --binlog-format=ROW --server-id=1

Get the IP address from the master container:

# Get the IP of the master 
docker inspect mysql_master | grep IPAddress 

"SecondaryIPAddresses": null, 
    "IPAddress": "172.17.0.2"

You can connect to this container like this and verify the server id:

stefan@Lenovo ~/Docker-Projects $ mysql -u root -h 172.17.0.2
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.35-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0,00 sec)

We repeat the command for the slave, but use a different server id and container name:

docker run --name mysql_slave -e MYSQL_ALLOW_EMPTY_PASSWORD=true -d mysql:5.6 --server-id=2

For simplicity, we did not use Docker links, but will rather use IP addresses assigned by Docker directly.
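
As before, you can verify the server id of the slave container (assuming Docker assigned it the IP 172.17.0.3); it should report the value 2:

mysql -u root -h 172.17.0.3 -e "SHOW VARIABLES LIKE 'server_id';"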

Replication Setup

First, we need to setup a user with replication privileges. This user will connect from the slave to the master.

# On the host, interact with the master container
## Get the IP address of the slave container
$ docker inspect mysql_slave | grep IPAddress
            "SecondaryIPAddresses": null,
            "IPAddress": "172.17.0.3",
                    "IPAddress": "172.17.0.3",

## Login to the MySQL console of the master
### Grant permissions
GRANT REPLICATION SLAVE ON *.* TO `replication`@'172.17.0.3' IDENTIFIED BY 'SLAVE-SECRET';
### Get the current binlog position
mysql> SHOW MASTER STATUS;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| mysqld-bin.000002 |      346 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0,00 sec)

Now log into the slave container and add the connection details for the master:

## Connect to the MySQL Slave instance
$ mysql -u root -h 172.17.0.3

### Setup the slave

mysql> CHANGE MASTER TO   
  MASTER_HOST='172.17.0.2',
  MASTER_PORT=3306,
  MASTER_USER='replication', 
  MASTER_PASSWORD='SLAVE-SECRET',
  MASTER_LOG_FILE='mysqld-bin.000002', 
  MASTER_LOG_POS=346;
Query OK, 0 rows affected, 2 warnings (0,05 sec)

### Start and check
mysql>   start slave;
Query OK, 0 rows affected (0,01 sec)

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.17.0.2
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 346
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 284
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Now our simple slave setup is running.
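
As a quick smoke test, you can write to the master and read the data back from the slave a moment later (the database name repl_test is made up for this check):

# On the master
mysql -u root -h 172.17.0.2 -e "CREATE DATABASE repl_test; CREATE TABLE repl_test.t1 (id INT PRIMARY KEY); INSERT INTO repl_test.t1 VALUES (1);"
# On the slave
mysql -u root -h 172.17.0.3 -e "SELECT * FROM repl_test.t1;"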

Get some test data

Let's download the Sakila test database and import it into the master. It will be replicated immediately.

wget http://downloads.mysql.com/docs/sakila-db.tar.gz
tar xvfz sakila-db.tar.gz

mysql -u root -h 172.17.0.2 < sakila-db/sakila-schema.sql 
mysql -u root -h 172.17.0.2 < sakila-db/sakila-data.sql

Verify that the data is on the slave as well:

mysql -u root -h 172.17.0.3 -e "USE sakila;SHOW TABLES;"
+----------------------------+
| Tables_in_sakila           |
+----------------------------+
| actor                      |
| actor_info                 |
| address                    |
| category                   |
| city                       |
| country                    |
| customer                   |
...
| store                      |
+----------------------------+

After our setup is completed, we can proceed with Percona's pt-table-checksum.

Percona pt-table-checksum

The Percona pt-table-checksum tool requires the connection information of the master and the slave in a specific format. This is called a DSN (data source name), which is a comma-separated string of key-value pairs. We can store this information in a dedicated database called percona, in a table called DSN_Table. We create this table on the master. Note that the data gets replicated to the slave within the blink of an eye.

CREATE DATABASE percona;
USE percona;
 
CREATE TABLE `DSN_Table` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `dsn` varchar(255) NOT NULL,
 PRIMARY KEY (`id`)
);

The next step involves creating permissions on the slave and the master!

GRANT REPLICATION SLAVE, PROCESS, SUPER, SELECT ON *.* TO 'percona'@'172.17.0.1' IDENTIFIED BY 'SECRET';
GRANT ALL PRIVILEGES ON percona.* TO 'percona'@'172.17.0.1';

The percona user is needed to run the script. Note that this time the IP address belongs to the (Docker) host, which has the IP 172.17.0.1 by default. In real-world scenarios, this script would be run either on the master or on the slave directly.

Now we need to add the information about the slave to the table we created. The Percona tool could also read this from the process list, but it is more reliable to add the information ourselves. To do so, we add a record describing the slave DSN to the table we just created:

INSERT INTO percona.DSN_Table VALUES (1,'h=172.17.0.3,u=percona,p=SECRET,P=3306');

The pt-table-checksum tool then connects to the master instance and, via the DSN table, to the slave. It computes checksums of all databases and tables and compares the results. You can use the tool like this:

pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table \
  --recursion-method=dsn=t=percona.DSN_Table -h 172.17.0.2 -P 3306 -u percona -pSECRET
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T10:13:11      0      0        0       1       0   0.020 mysql.columns_priv
09-10T10:13:11      0      0        3       1       0   0.016 mysql.db
09-10T10:13:11      0      0        0       1       0   0.024 mysql.event
09-10T10:13:11      0      0        0       1       0   0.014 mysql.func
09-10T10:13:11      0      0       40       1       0   0.026 mysql.help_category
09-10T10:13:11      0      0      614       1       0   0.023 mysql.help_keyword
09-10T10:13:11      0      0     1224       1       0   0.022 mysql.help_relation
09-10T10:13:12      0      0      585       1       0   0.266 mysql.help_topic
09-10T10:13:12      0      0        0       1       0   0.031 mysql.ndb_binlog_index
09-10T10:13:12      0      0        0       1       0   0.024 mysql.plugin
09-10T10:13:12      0      0        6       1       0   0.287 mysql.proc
09-10T10:13:12      0      0        0       1       0   0.031 mysql.procs_priv
09-10T10:13:12      0      1        2       1       0   0.020 mysql.proxies_priv
09-10T10:13:12      0      0        0       1       0   0.024 mysql.servers
09-10T10:13:12      0      0        0       1       0   0.017 mysql.tables_priv
09-10T10:13:12      0      0     1820       1       0   0.019 mysql.time_zone
09-10T10:13:12      0      0        0       1       0   0.015 mysql.time_zone_leap_second
09-10T10:13:12      0      0     1820       1       0   0.267 mysql.time_zone_name
09-10T10:13:13      0      0   122530       1       0   0.326 mysql.time_zone_transition
09-10T10:13:13      0      0     8843       1       0   0.289 mysql.time_zone_transition_type
09-10T10:13:13      0      1        4       1       0   0.031 mysql.user
09-10T10:13:13      0      0        1       1       0   0.018 percona.DSN_Table
09-10T10:13:13      0      0      200       1       0   0.028 sakila.actor
09-10T10:13:13      0      0      603       1       0   0.023 sakila.address
09-10T10:13:13      0      0       16       1       0   0.033 sakila.category
09-10T10:13:13      0      0      600       1       0   0.023 sakila.city
09-10T10:13:13      0      0      109       1       0   0.029 sakila.country
09-10T10:13:14      0      0      599       1       0   0.279 sakila.customer
09-10T10:13:14      0      0     1000       1       0   0.287 sakila.film
09-10T10:13:14      0      0     5462       1       0   0.299 sakila.film_actor
09-10T10:13:14      0      0     1000       1       0   0.027 sakila.film_category
09-10T10:13:14      0      0     1000       1       0   0.032 sakila.film_text
09-10T10:13:14      0      0     4581       1       0   0.276 sakila.inventory
09-10T10:13:15      0      0        6       1       0   0.030 sakila.language
09-10T10:13:15      0      0    16049       1       0   0.303 sakila.payment
09-10T10:13:15      0      0    16044       1       0   0.310 sakila.rental
09-10T10:13:15      0      0        2       1       0   0.029 sakila.staff
09-10T10:13:15      0      0        2       1       0   0.020 sakila.store

The result shows a difference in MySQL's internal tables for permissions. This is not what we are interested in, as permissions are individual to a host, so we exclude the mysql internal database as well as the percona database. In order to test that the tool detects differences, we delete the last five category assignments from the film_category table on the slave:

mysql -u root -h 172.17.0.3 -e "DELETE FROM sakila.film_category WHERE film_id > 995;"

We also update a row in the city table:

mysql -u root -h 172.17.0.3 -e "update sakila.city SET city='Innsbruck' WHERE city_id=590;"

Now execute the command again:

pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table \
   --recursion-method=dsn=t=percona.DSN_Table --ignore-databases mysql,percona -h 172.17.0.2 -P 3306 -u percona -pSECRET
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T10:46:33      0      0      200       1       0   0.017 sakila.actor
09-10T10:46:34      0      0      603       1       0   0.282 sakila.address
09-10T10:46:34      0      0       16       1       0   0.034 sakila.category
09-10T10:46:34      0      1      600       1       0   0.269 sakila.city
09-10T10:46:34      0      0      109       1       0   0.028 sakila.country
09-10T10:46:34      0      0      599       1       0   0.285 sakila.customer
09-10T10:46:35      0      0     1000       1       0   0.297 sakila.film
09-10T10:46:35      0      0     5462       1       0   0.294 sakila.film_actor
09-10T10:46:35      0      1     1000       1       0   0.025 sakila.film_category
09-10T10:46:35      0      0     1000       1       0   0.031 sakila.film_text
09-10T10:46:35      0      0     4581       1       0   0.287 sakila.inventory
09-10T10:46:35      0      0        6       1       0   0.035 sakila.language
09-10T10:46:36      0      0    16049       1       0   0.312 sakila.payment
09-10T10:46:36      0      0    16044       1       0   0.320 sakila.rental
09-10T10:46:36      0      0        2       1       0   0.030 sakila.staff
09-10T10:46:36      0      0        2       1       0   0.027 sakila.store

You see that there is a difference in the tables sakila.city and sakila.film_category. The tool does not report the actual number of differing rows, but rather the number of differing chunks. To get the actual differences, we need to use a different tool, which utilises the checksum table that the previous step created.
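
Alternatively, you can inspect the checksum results by hand: they are stored in percona.checksums on the master and, via replication, on each slave. The following query is adapted from the Percona documentation and lists the tables with differing chunks:

SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE (
    master_cnt <> this_cnt
    OR master_crc <> this_crc
    OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;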

Show the differences with pt-table-sync

The pt-table-sync tool is the counterpart of the pt-table-checksum utility. It can print or even replay the SQL statements that would bring the slave back in sync with the master. As the tool is potentially dangerous, we run it with --dry-run first.

pt-table-sync --dry-run  --replicate=percona.checksums --sync-to-master h=172.17.0.3 -P 3306 \
   -u percona -pSECRET --ignore-databases mysql,percona
# NOTE: --dry-run does not show if data needs to be synced because it
#       does not access, compare or sync data.  --dry-run only shows
#       the work that would be done.
# Syncing via replication P=3306,h=172.17.0.3,p=...,u=percona in dry-run mode, without accessing or comparing data
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 Chunk     08:57:51 08:57:51 0    sakila.city
#      0       0      0      0 Nibble    08:57:51 08:57:51 0    sakila.film_category

With --dry-run, you only see the affected tables, but not the actual data, because the tool does not really access the tables in question. Use --print in addition to or instead of --dry-run to get a list of the statements:

pt-table-sync --print --replicate=percona.checksums --sync-to-master h=172.17.0.3 -P 3306 \
  -u percona -pSECRET --ignore-databases mysql,percona
REPLACE INTO `sakila`.`city`(`city_id`, `city`, `country_id`, `last_update`) VALUES \
   ('590', 'Yuncheng', '23', '2006-02-15 04:45:25') 
  \ /*percona-toolkit src_db:sakila src_tbl:city  ...
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ... 
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('997',... 
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('998', '11 ...
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('999', '3', ...
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('1000', '5', ... 

The command shows how we can rename Innsbruck back to Yuncheng and also provides the REPLACE statements to restore the deleted records. When we replace --print with --execute, the data gets written to the master and replicated to the slave. To allow this, we need to set the permissions on the master:

GRANT INSERT, UPDATE, DELETE ON sakila.* TO 'percona'@'172.17.0.1';
pt-table-sync --execute  --replicate=percona.checksums --check-child-tables \ 
  --sync-to-master h=172.17.0.3 -P 3306 -u percona -pSECRET --ignore-databases mysql,percona
REPLACE statements on sakila.city can adversely affect child table `sakila`.`address` 
   because it has an ON UPDATE CASCADE foreign key constraint. 
   See --[no]check-child-tables in the documentation for more information. 
   --check-child-tables error  while doing sakila.city on 172.17.0.3

This error indicates that updating the city table has consequences, because child tables reference it via an ON UPDATE CASCADE foreign key. You can inspect the constraint as shown below. In this example, we are bold and ignore this warning; this is absolutely not recommended for real-world scenarios.
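
To see the constraint behind the warning, you can inspect the child table; the relevant lines (shown as comments) reflect the stock Sakila schema:

SHOW CREATE TABLE sakila.address\G
-- ... CONSTRAINT `fk_address_city` FOREIGN KEY (`city_id`)
--     REFERENCES `city` (`city_id`) ON UPDATE CASCADE ...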

pt-table-sync --execute  --replicate=percona.checksums --no-check-child-tables \
   --no-foreign-key-checks --sync-to-master h=172.17.0.3 -P 3306 -u percona -pSECRET \ 
   --ignore-databases mysql,percona

The option --no-check-child-tables skips the child table check and the option --no-foreign-key-checks disables foreign key checks.

Run the checksum command again to verify that the data has been restored:

pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table \ 
   --recursion-method=dsn=t=percona.DSN_Table --ignore-databases mysql,percona \
   -h 172.17.0.2 -P 3306 -u percona -pSECRET

            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T11:24:42      0      0      200       1       0   0.268 sakila.actor
09-10T11:24:42      0      0      603       1       0   0.033 sakila.address
09-10T11:24:42      0      0       16       1       0   0.029 sakila.category
09-10T11:24:42      0      0      600       1       0   0.275 sakila.city
09-10T11:24:42      0      0      109       1       0   0.023 sakila.country
09-10T11:24:43      0      0      599       1       0   0.282 sakila.customer
09-10T11:24:43      0      0     1000       1       0   0.046 sakila.film
09-10T11:24:43      0      0     5462       1       0   0.284 sakila.film_actor
09-10T11:24:43      0      0     1000       1       0   0.036 sakila.film_category
09-10T11:24:43      0      0     1000       1       0   0.279 sakila.film_text
09-10T11:24:44      0      0     4581       1       0   0.292 sakila.inventory
09-10T11:24:44      0      0        6       1       0   0.031 sakila.language
09-10T11:24:44      0      0    16049       1       0   0.309 sakila.payment
09-10T11:24:44      0      0    16044       1       0   0.325 sakila.rental
09-10T11:24:44      0      0        2       1       0   0.029 sakila.staff
09-10T11:24:44      0      0        2       1       0   0.028 sakila.store

0 DIFFS, we are done!

Continue reading


Using Hibernate Search with Spring Boot

Spring Boot is a framework that makes it much easier to develop Spring-based applications by following a convention-over-configuration principle (while, in contrast, Spring critics claim that the framework's principle is rather configuration over everything). In this article, I am going to provide an example of how to achieve the following:

  • Create a simple Web application based on Spring Boot
  • Persist and access data with Hibernate
  • Make it searchable with Hibernate Search (Lucene)

I use Eclipse with a Gradle plugin for convenience. MySQL will be our back-end for storing the data. The full example can be obtained from my Github Repository.

Bootstrapping: Create a Simple Spring Boot Webapp

The easiest way to start with Spring Boot is heading over to start.spring.io and creating a new project. In this example, I will use Gradle for building the application and handling the dependencies, and I add the Web and JPA starters.

Download the archive to your local drive and extract it to a folder. I called the project SearchaRoo.

Import the Project with Eclipse

Import it as an existing Gradle Project in Eclipse by using the default settings. You will end up with a nice little project structure as shown below:

We have a central application starter class denoted SearchaRooApplication.java, package definitions, application properties and even test classes. The great thing about Spring Boot is that it is very simple to start and that you can debug it like any other local Java application. There is no need for remote debugging or complex application server setups.

Prepare the Database

We need a few permissions on our MySQL instance before we can start.

CREATE DATABASE spring_employees;
CREATE USER 'dev'@'localhost' IDENTIFIED BY 'sEcReT';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, FILE, REFERENCES, INDEX,
    ALTER, SHOW DATABASES, SUPER, LOCK TABLES, CREATE VIEW, SHOW VIEW 
    on spring_employees.* TO 'dev'@'localhost';
GRANT RELOAD on *.* TO 'dev'@'localhost';
FLUSH PRIVILEGES;

We can then add the connection details to the application.properties file. We will edit this file several times as the complexity of this project increases.

# ===============================
# = JPA / HIBERNATE
# ===============================

# Specify the DBMS
spring.jpa.database = MYSQL

# Show or not log for each sql query
spring.jpa.show-sql = true
spring.datasource.url=jdbc:mysql://127.0.0.1/employees?createDatabaseIfNotExist=true
spring.datasource.username=dev
spring.datasource.password=sEcReT
spring.datasource.driver-class-name=com.mysql.jdbc.Driver

Now the basic database setup is done. We can then start adding model classes.

Getting some Employees on Board

MySQL offers a rather small but well documented sample database called employees, which is hosted on Github.  Obtain and import the data as follows:

git clone https://github.com/datacharmer/test_db.git
cd test_db
mysql -u dev -psEcReT < employees.sql

The script creates a new schema called employees and you will end up with a schema like this:

In the course of this article, we are going to model this schema with Java POJOs by annotating the entities and the appropriate fields with JPA.

Dependencies

Before we can start modelling the entities in Java, have a look at the Gradle build file. We include additional dependencies for the MySQL connector and Apache commons.

buildscript {
	ext {
		springBootVersion = '1.5.1.RELEASE'
	}
	repositories {
		mavenCentral()
	}
	dependencies {
		classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
	}
}

apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'org.springframework.boot'

jar {
	baseName = 'SearchaRoo'
	version = '0.0.1-SNAPSHOT'
}

sourceCompatibility = 1.8

repositories {
	mavenCentral()
}


dependencies {
	compile('org.springframework.boot:spring-boot-starter-data-jpa')
	compile('org.springframework.boot:spring-boot-starter-web')
	testCompile('org.springframework.boot:spring-boot-starter-test')
	compile("mysql:mysql-connector-java")
	compile('org.apache.commons:commons-lang3:3.5')
}

Modelling Reality

The next step covers modelling the imported data with Java POJOs. Obviously, this is not the most natural order, because in general you would create the model first and then add data to it; but as we already had the data, we went in this direction. In the application.properties file, point the connection to the imported employees database and set the Hibernate ddl-auto property to validate. With this setting, we can confirm that we modelled the Java classes in accordance with the database model defined by the MySQL employees database.
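
A sketch of the relevant application.properties lines for this validation step (the URL assumes the locally imported employees schema):

spring.datasource.url=jdbc:mysql://127.0.0.1/employees
spring.jpa.hibernate.ddl-auto=validate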

An example of such a class is shown below, the other classes can be found in the Github repository.

package at.stefanproell.model;

import java.util.Date;
import java.util.List;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EnumType;
import javax.persistence.Enumerated;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity(name="employees")
public class Employee {
    @Id
    @Column(name="emp_no")
    private int employee_id;
    
    @Column(name="birth_date")
    private Date birthdate;
    
    @Column(name="first_name")
    private String firstname;
    
    @Column(name="last_name")
    private String lastname;
    
    @Column(name="gender",columnDefinition = "ENUM('M', 'F', 'UNKNOWN') DEFAULT 'UNKNOWN'")
    @Enumerated(EnumType.STRING)
    private Gender gender;
    
    @Column(name="hire_date")
    private Date hireDate;

    @OneToMany(mappedBy="employee")
    List<Title> titles; 
    
    @OneToMany(mappedBy="employee")
    List<Department_Employee> employee_department;

    // Setters and getters
    
    
}

Now that we have prepared the data model, our schema is fixed and does not change anymore. We can deactivate the Hibernate-based dynamic generation of the database tables and use the Spring database initialization instead. To see if we modelled the data correctly, we copy the MySQL employee data we obtained before into our newly created schema, which maps the Java POJOs.

Importing the Initial Data

In the next step, we import the data from the MySQL employee database into our schema spring_employees. This schema contains the tables that Hibernate created for us. The following script copies the data between the two schemata. If you see an error, there is an issue with your model.

-- The original data is stored in the database called employees
-- Spring created the new schema called spring_employees
USE `spring_employees`;

-- Departments

INSERT INTO `spring_employees`.`departments`
(`dept_no`,
`dept_name`)
SELECT `departments`.`dept_no`,
    `departments`.`dept_name`
FROM `employees`.`departments`;

-- Employees

INSERT INTO `spring_employees`.`employees`
(`emp_no`,
`birth_date`,
`first_name`,
`gender`,
`hire_date`,
`last_name`)
SELECT `employees`.`emp_no`,
    `employees`.`birth_date`,
    `employees`.`first_name`,
    `employees`.`gender`,
    `employees`.`hire_date`,
    `employees`.`last_name`
FROM `employees`.`employees`;

-- Join table 
INSERT INTO `spring_employees`.`dept_emp`
(`emp_no`,
`dept_no`,
`from_date`,
`to_date`)
SELECT 
`dept_emp`.`emp_no`,
    `dept_emp`.`dept_no`,
    `dept_emp`.`from_date`,
    `dept_emp`.`to_date`
FROM `employees`.`dept_emp`;

-- Join table 

INSERT INTO `spring_employees`.`dept_manager`
(
`emp_no`,
`dept_no`,
`from_date`,
`to_date`)
SELECT `dept_manager`.`emp_no`,
    `dept_manager`.`dept_no`,
    `dept_manager`.`from_date`,
    `dept_manager`.`to_date`
FROM `employees`.`dept_manager`;

-- Titles

INSERT INTO `spring_employees`.`titles`
(`emp_no`,
`title`,
`from_date`,
`to_date`)
SELECT `titles`.`emp_no`,
    `titles`.`title`,
    `titles`.`from_date`,
    `titles`.`to_date`
FROM `employees`.`titles`;

-- Salaries

INSERT INTO `spring_employees`.`salaries`
(`emp_no`,
`salary`,
`from_date`,
`to_date`)
SELECT `salaries`.`emp_no`,
    `salaries`.`salary`,
    `salaries`.`from_date`,
    `salaries`.`to_date`
FROM `employees`.`salaries`;

We have now imported the data into the database schema that we defined for our project. Spring can load schema and initial data during start-up, so we provide two files: one containing the schema and the other one containing the data. To create them, we take two dumps of the database, one containing the schema only and the other one containing the data only.

mysqldump -u dev -psEcReT --no-data --databases spring_employees > src/main/resources/schema.sql
mysqldump -u dev -psEcReT --no-create-info --databases spring_employees > src/main/resources/data.sql

By deactivating the Hibernate data generation and activating the Spring way, the database gets initialized every time the application starts. Change the following lines in the application.properties file:

spring.jpa.hibernate.ddl-auto=none
spring.datasource.initialize=true
spring.datasource.schema=classpath:/schema.sql
spring.datasource.data=classpath:/data.sql

Before we can import the data with these scripts, make sure to drop the schema and disable foreign key checks at the beginning of the schema file, and enable them again at the end. Spring ignores MySQL's conditional (/*! ... */) comments from the dump, so the statements have to be added explicitly. Your schema file should contain this:

DROP DATABASE IF EXISTS `spring_employees`;
SET foreign_key_checks = 0;

-- rest of the schema

SET foreign_key_checks = 1;

Also insert the two foreign key statements into the data file, as sketched below. Note that the import can take a while. If you are happy with the initialized data, you can deactivate the initialization by setting the variable to false: spring.datasource.initialize=false
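
A sketch of the resulting data file structure:

SET foreign_key_checks = 0;

-- INSERT statements produced by mysqldump ...

SET foreign_key_checks = 1;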

The application.properties file meanwhile looks like this:

# ===============================
# = JPA / HIBERNATE
# ===============================

# Specify the DBMS
spring.jpa.database = MYSQL

# Show or not log for each sql query
spring.jpa.show-sql = true
spring.datasource.url=jdbc:mysql://127.0.0.1/spring_employees?createDatabaseIfNotExist=true
spring.datasource.username=dev
spring.datasource.password=sEcReT
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.connectionProperties=useUnicode=true;characterEncoding=utf-8;
spring.jpa.properties.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect
# Do not initialize anything
spring.jpa.hibernate.ddl-auto=none
spring.datasource.initialize=false
spring.datasource.schema=classpath:/schema.sql
spring.datasource.data=classpath:/data.sql
spring.datasource.platform=mysql

Adding Hibernate Search

Hibernate Search offers full-text search capabilities based on a dedicated Lucene index. We need to add the dependencies to the build file.

dependencies {
	compile('org.springframework.boot:spring-boot-starter-data-jpa')
	compile('org.springframework.boot:spring-boot-starter-web')
	testCompile('org.springframework.boot:spring-boot-starter-test')
	compile("mysql:mysql-connector-java")
	compile('org.apache.commons:commons-lang3:3.5')
	compile("org.hibernate:hibernate-search-orm:5.5.6.Final")
	compile('org.springframework.boot:spring-boot-starter-test')
	compile('org.springframework.boot:spring-boot-starter-logging')
	compile('org.springframework.boot:spring-boot-starter-freemarker')
}

Refresh the Gradle project after including the search dependencies.

Configuring Hibernate Search

In this step, we annotate the model POJO classes and introduce the full-text search index. Hibernate Search needs just a few basic settings to get started. Add the following variables to the application.properties file.

# ===============================
# = HIBERNATE SEARCH
# ===============================

# Spring Data JPA will take any properties under spring.jpa.properties.* and 
# pass them along (with the prefix stripped) once the EntityManagerFactory is 
# created.

# Specify the DirectoryProvider to use (the Lucene Directory)
spring.jpa.properties.hibernate.search.default.directory_provider = filesystem

# Using the filesystem DirectoryProvider you also have to specify the default
# base directory for all indexes (make sure that the application has write
# permissions on this directory)
spring.jpa.properties.hibernate.search.default.indexBase = /tmp/SearchaRoo/

Please note that storing the Lucene index in the tmp directory is not the best idea, but for testing we can use this rather volatile location. We also use the filesystem to store the index, as this is the simplest approach.

Create a Service

In order to facilitate Hibernate Search on our data, we add a service class which offers methods for searching. The service uses a configuration which Spring injects at runtime. The configuration is very simple.

package at.stefanproell.service;

import javax.persistence.EntityManager;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;



@Configuration
public class HibernateSearchConfiguration {
	private final Logger logger = LoggerFactory.getLogger(HibernateSearchConfiguration.class);

	@Autowired
	private EntityManager entityManager;

	@Bean
	HibernateSearchService hibernateSearchService() {
		HibernateSearchService hibernateSearchService = new HibernateSearchService(entityManager);
		hibernateSearchService.initializeHibernateSearch();
		return hibernateSearchService;
	}
}

The @Configuration class is loaded when Spring builds the application context. It provides a bean of our service, which can then be injected into the application. The service itself provides methods for creating and searching the index. In this example, the search method is very simple: it only searches on the first and the last name of an employee and allows users to make one mistake (edit distance 1).

package at.stefanproell.service;

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.NoResultException;
import javax.persistence.PersistenceContext;

import org.apache.lucene.search.Query;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;
import org.hibernate.search.query.dsl.QueryBuilder;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;


import at.stefanproell.model.Employee;
import ch.qos.logback.classic.Logger;

@Service
public class HibernateSearchService  {

    private final Logger logger = (Logger) LoggerFactory.getLogger(HibernateSearchService.class);

    private final EntityManager entityManager;

    @Autowired
    public HibernateSearchService(EntityManager entityManager) {
        super();
        this.entityManager = entityManager;
    }


    public void initializeHibernateSearch() {

        try {
            FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
            fullTextEntityManager.createIndexer().startAndWait();
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
    
    @Transactional
    public List<Employee> fuzzySearch(String searchTerm){

        FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
        QueryBuilder qb = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Employee.class).get();
        Query luceneQuery = qb.keyword().fuzzy().withEditDistanceUpTo(1).withPrefixLength(1).onFields("firstname", "lastname")
                .matching(searchTerm).createQuery();

        javax.persistence.Query jpaQuery = fullTextEntityManager.createFullTextQuery(luceneQuery, Employee.class);

        // execute search

        List<Employee> employeeList = null;
        try {
            employeeList  = jpaQuery.getResultList();
        } catch (NoResultException nre) {
            logger.warn("No result found");

        }

        return employeeList;

    
    }
}

The service implementation currently only contains an initialization method, which is used for creating the Lucene index on the filesystem. Before we can test the index, we need to have at least one indexed entity. This can be achieved by simply adding the annotation @Indexed to the POJO.

package at.stefanproell.model;

import java.util.Date;
import java.util.List;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EnumType;
import javax.persistence.Enumerated;
import javax.persistence.Id;
import javax.persistence.OneToMany;

import org.hibernate.search.annotations.Indexed;

@Entity(name="employees")
@Indexed
public class Employee {
...
}

When we start the application now, we can see that Hibernate creates the index and a short check on disk shows that it worked:

$ tree /tmp/SearchaRoo/
/tmp/SearchaRoo/
└── at.stefanproell.model.Employee
    ├── _6.fdt
    ├── _6.fdx
    ├── _6.fnm
    ├── _6_Lucene50_0.doc
    ├── _6_Lucene50_0.pos
    ├── _6_Lucene50_0.tim
    ├── _6_Lucene50_0.tip
    ├── _6.si
    ├── segments_2
    └── write.lock

1 directory, 10 files

So far, we did not tell Hibernate Search which fields we want to add to the index and thus make full-text searchable. The following listing shows the annotated @Fields.

package at.stefanproell.model;

import java.util.Date;
import java.util.List;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EnumType;
import javax.persistence.Enumerated;
import javax.persistence.Id;
import javax.persistence.OneToMany;

import org.hibernate.search.annotations.Analyze;
import org.hibernate.search.annotations.DateBridge;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Resolution;
import org.hibernate.search.annotations.Store;

@Entity(name="employees")
@Indexed
public class Employee {
	@Id
	@Column(name="emp_no")
	private int employee_id;
	
	@Field(index = Index.YES, analyze=Analyze.NO, store = Store.YES)
	@DateBridge(resolution = Resolution.DAY)
	@Column(name="birth_date")
	private Date birthdate;
	
	@Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
	@Column(name="first_name")
	private String firstname;
	
	@Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
	@Column(name="last_name")
	private String lastname;
	
	@Column(name="gender",columnDefinition = "ENUM('M', 'F', 'UNKNOWN') DEFAULT 'UNKNOWN'")
	@Enumerated(EnumType.STRING)
	private Gender gender;
	
	@Field(index = Index.YES, analyze=Analyze.NO, store = Store.YES)
	@DateBridge(resolution = Resolution.DAY)
	@Column(name="hire_date")
	private Date hireDate;

	@OneToMany(mappedBy="employee")
	List<Title> titles; 
	
	@OneToMany(mappedBy="employee")
	List<Department_Employee> employee_department;

	...
	
}

Starting the application again re-creates the index. Time for some basic searching.

Searching Fulltext

Hibernate Search offers many features which native databases do not provide in similar quality. One interesting feature is fuzzy search, which allows finding terms within a given edit distance. The method for searching on two fields was already shown above. We can use this method in a small JUnit test:

package at.stefanproell;

import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import at.stefanproell.model.Employee;
import at.stefanproell.service.HibernateSearchService;
import ch.qos.logback.classic.Logger;

@RunWith(SpringRunner.class)
@SpringBootTest
public class SearchaRooApplicationTests {
	private final Logger logger = (Logger) LoggerFactory.getLogger(SearchaRooApplicationTests.class);

	@Autowired
	HibernateSearchService service;
	
	@Test
	public void searchSingleTerm() {
		List<Employee> result = service.fuzzySearch("Chrisu");
		logger.info("Found " + result.size() + " employees");
		assert(result.size()>0);
		
		for(Employee emp : result){
			logger.info("Id: "+ emp.getEmployee_id() + " " + emp.getFirstname() + " " + emp.getLastname() );
		}
	}
}

The user made a small typo by entering Chrisu instead of Chris. As we allowed one mistake, we receive a list of similar names and the test passes. Some possible results are shown below.

Id: 456935 Hiroyasu Christ
Id: 467982 Alselm Christ
Id: 482230 Dinah Christ
Id: 484411 Akeel Christ
Id: 498582 Mihalis Christ
Id: 14298 Chriss Granlund
Id: 14418 Chriss Kadhim
Id: 18643 Christ Bernini
Id: 19919 Chrisa Gornas
Id: 20537 Chrisa Rando

Conclusions

Hibernate Search is a great tool and can be easily integrated into Spring Boot applications. In this post, I gave a minimalistic example of how fuzzy full-text search can be added to existing databases, allowing a flexible and powerful search. A few more advanced thoughts on Hibernate Search are given in this blog post. The Hibernate Search documentation contains a lot of useful and more elaborate examples. The full example can be obtained from Github.

Continue reading


Deploying MySQL in a Local Development Environment

Installing MySQL via apt-get is a simple task, but migrating between different MySQL versions requires planning and testing. Installing one central instance of the database system might therefore not be suitable when the MySQL version or project-specific settings need to be switched quickly without interfering with other applications; a single central instance quickly becomes cumbersome. In this article, I will describe how any number of MySQL instances can be stored and executed from within a user's home directory.

Adapting MySQL Data and Log File Locations

Some scenarios might require running several MySQL instances at once; others involve sensitive data, where we do not want MySQL to write anything to non-encrypted partitions. This is especially true for devices which can easily get stolen, for instance laptops. If you use a laptop for developing your applications from time to time, chances are good that you need to store sensitive data in a database and have to make sure that the data is encrypted at rest.

This can be solved with full disk encryption, but this technique has several disadvantages. First of all, full disk encryption utilises only one password, which entails that several users of a device need to share it; this weakens the approach. Full disk encryption can also become an obstacle when the system needs to be rebooted, which increases the complexity further.

Much easier to use is transparent home directory encryption, which many modern Linux setup procedures offer out of the box. We will use this encryption type for this article, as it is reasonably secure and easy to set up. Our goal is to store all MySQL-related data in the home directory and run MySQL with normal user privileges.

Creating the Directory Structure

The first step is creating a directory structure for storing the data. In this example, the user name is stefan, please adapt to your needs.

$ cd ~
$ mkdir -p MySQL-5.6-Local/MySQL-5.6-Data # Data storage
$ mkdir -p MySQL-5.6-Local/MySQL-5.6-Conf # Config file
$ mkdir -p MySQL-5.6-Local/MySQL-5.6-Log # Log file
$ mkdir -p MySQL-5.6-Local/MySQL-5.6-Tmp # Tmp
$ mkdir -p MySQL-5.6-Local/MySQL-5.6-Scripts # Start and stop scripts

Create a Configuration File

Make sure to use absolute paths and the directories we created before. Store this file as MySQL-5.6-Local/MySQL-5.6-Conf/my-5.6.cnf. The configuration is pretty self-explanatory.

[client]
port        = 3306
socket      = /home/stefan/MySQL-5.6-Local/mysqld-5.6.sock

[mysqld_safe]
pid-file    = /home/stefan/MySQL-5.6-Local/mysqld-5.6.pid
nice        = 0
log_error=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Log/mysql_error.log

[mysqld]
user        = stefan
pid-file    = /home/stefan/MySQL-5.6-Local/mysqld-5.6.pid
socket      = /home/stefan/MySQL-5.6-Local/mysqld-5.6.sock
port        = 3306
basedir     = /usr
datadir     = /home/stefan/MySQL-5.6-Local/MySQL-5.6-Data
tmpdir      = /home/stefan/MySQL-5.6-Local/MySQL-5.6-Tmp
log_error=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Log/mysql_error.log
bind-address = 127.0.0.1

Stop the Running MySQL Instance

If you already have a running MySQL instance, make sure to shut it down. You can also disable MySQL from starting automatically.

sudo service mysql stop
echo "manual" | sudo tee /etc/init/mysql.override

Setting up Apparmor

Apparmor protects sensitive applications by defining which directories they may write to. We need to update this configuration to suit our needs. The global configuration file for the central MySQL instance needs to include an additional local configuration file. Edit /etc/apparmor.d/usr.sbin.mysqld first and make sure that the reference to the local file is not commented out.

# vim:syntax=apparmor
# Last Modified: Fri Feb 28 18:06:30 2014
#include <tunables/global>
 
 
/usr/sbin/mysqld {
  #include <abstractions/base>
  #include <abstractions/nameservice>
  #include <abstractions/user-tmp>
 
# .... shortened
 
  # Include local configuration!!
  include <local/usr.sbin.mysqld>
 
}

Now we need to add the directories in stefan's home directory to the local file by editing /etc/apparmor.d/local/usr.sbin.mysqld .

# Site-specific additions and overrides for usr.sbin.mysqld.
# For more details, please see /etc/apparmor.d/local/README.

/home/stefan/MySQL-5.6-Local/ r, 
/home/stefan/MySQL-5.6-Local/** rwk,

An incorrect Apparmor configuration is often the cause of permission errors, which can be a pain. Make sure to reload the Apparmor service; if you struggle with it, consider disabling it temporarily and check if the rest works. Do not forget to turn it on again.

sudo service apparmor reload # reload
sudo service apparmor teardown # disable for testing

Initialize the Local MySQL Instance

Now it is time to initialize the MySQL instance. In this step, MySQL creates all the files it needs in the data directory. It is important that the data directory is empty when you run the following command.

sudo mysql_install_db --defaults-file=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Conf/my-5.6.cnf --datadir=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Data --basedir=/usr

Note that this command is marked as deprecated. It works with MySQL 5.6 and MySQL 5.7, but may be removed in future versions.
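
On MySQL 5.7 and later, the server binary can initialize the data directory itself; below is a sketch reusing the paths from above (the --initialize-insecure option creates the root account without a password):

mysqld --defaults-file=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Conf/my-5.6.cnf --initialize-insecure --datadir=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Data --basedir=/usr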

Start and Stop the Instance

You can now start the MySQL instance with the following command:

mysqld_safe --defaults-file=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Conf/my-5.6.cnf

For your convenience, add a custom client configuration in your $HOME/.my.cnf and point it to the user defined socket.

[client]
port        = 3306
socket      = /home/stefan/MySQL-5.6-Local/mysqld-5.6.sock

In addition, startup and shutdown scripts are useful as well. Place both scripts in the directory we created before and add execution permissions with chmod +x .

# Startup script
mysqld_safe --defaults-file=/home/stefan/MySQL-5.6-Local/MySQL-5.6-Conf/my-5.6.cnf --timezone=utc

The stop script is similar.

# Shutdown script
mysqladmin -u root -S /home/stefan/MySQL-5.6-Local/mysqld-5.6.sock shutdown
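
A status check along the same lines is handy as well; mysqladmin ping prints "mysqld is alive" when the server responds on the given socket:

# Status check
mysqladmin -u root -S /home/stefan/MySQL-5.6-Local/mysqld-5.6.sock ping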

Conclusion

The technique described above allows installing and running multiple MySQL instances from within the user's home directory. The MySQL instances run with user privileges and can utilise dedicated data and log file directories. As all data is stored within the $HOME directory, we can easily apply transparent encryption to protect the data at rest.

Continue reading


A MySQL 5.7 Cluster Based on Ubuntu 16.04 LTS – Part 2

In a recent article, I described how to set up a basic MySQL Cluster with two data nodes and a combined SQL and management node. In this article, I am going to highlight a few more things and we are going to adapt the cluster a little bit.

Using Hostnames

To make our lives easier, we can use hostnames, which are easier to remember than IP addresses. Hostnames can be specified for each VM in the file /etc/hosts . For each request to a hostname, the operating system will look up the corresponding IP address. We need to change this file on all three nodes as in the following example:

root@node1:~# cat /etc/hosts
127.0.0.1	localhost
192.168.0.81	node1
192.168.0.82	node2
192.168.0.83	node3


# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Now we can lookup IP addresses by the associated hostname.
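
You can confirm the resolution on each node with getent, which queries the same resolver the cluster processes use:

root@node1:~# getent hosts node2
192.168.0.82    node2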

Adapting Node1

The management and the SQL node are both running in the same VM. Now that we introduced the hostnames for all machines, we need to reconfigure the cluster manager and the MySQL server. The cluster configuration is stored in the file /opt/mysql/server-5.7/mysql-cluster/config.ini . Replace all the IP addresses with the corresponding hostnames.

root@node1:~# cat /opt/mysql/server-5.7/mysql-cluster/config.ini
[ndb_mgmd]
NodeId=1
# Management process options:
hostname= node1  # Hostname of the manager
datadir=/opt/mysql/server-5.7/mysql-cluster/cluster-data  # Directory for the log files

[ndbd]
NodeId=2
hostname= node2    # Hostname of the first data node
datadir=/usr/local/mysql/data   # Remote directory for the data files

[ndbd]
NodeId=3
hostname= node3
datadir=/usr/local/mysql/data   # Remote directory for the data files

[mysqld]
# SQL node options:
NodeId=4
hostname=node1

To reconfigure the SQL node, we need to adapt the file /etc/my.cnf and replace all IP addresses in this file too.

root@node1:~# cat /etc/my.cnf 
[mysqld_safe]
socket          = /tmp/mysql.sock
nice            = 0

[mysqld]
#
# * Basic Settings
#

user            = mysql
pid-file        = /tmp/mysql.pid
socket          = /tmp/mysql.sock
port            = 3306
basedir         = /opt/mysql/server-5.7
datadir         = /opt/mysql/server-5.7/data
tmpdir          = /tmp
skip-external-locking
bind-address=node1

key_buffer_size         = 16M
max_allowed_packet      = 16M
thread_stack            = 192K
thread_cache_size       = 8
query_cache_limit       = 1M
query_cache_size        = 16M

log_error = /var/log/mysql/error.log
expire_logs_days        = 10
max_binlog_size   = 100M

#
# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem


sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES 

ndbcluster # run NDB storage engine
ndb-connectstring=node1

[mysql_cluster]
# Options for MySQL Cluster processes:
ndb-connectstring=node1 

When done, stop the cluster and the SQL node like this:

root@node1:~# /opt/mysql/server-5.7/bin/ndb_mgm -e shutdown
Connected to Management Server at: node1:1186
3 NDB Cluster node(s) have shutdown.
Disconnecting to allow management server to shutdown.

root@node1:~# service mysql stop

Adapt the Data Nodes

Replacing the IP addresses with hostnames is pretty straightforward. On each data node, change the IP address to the host name in the file /etc/my.cnf :

[mysqld]
# Options for mysqld process:
ndbcluster

[mysql_cluster]
ndb-connectstring=node1

Startup the Cluster

Now that we have exchanged all IP addresses for hostnames, we can restart the cluster as follows:

# Login into node1
root@node1:~# /opt/mysql/server-5.7/bin/ndb_mgmd --reload -f /opt/mysql/server-5.7/mysql-cluster/config.ini
MySQL Cluster Management Server mysql-5.7.16 ndb-7.5.4

# Login into each data node and restart the ndbd process
stefan@node2:~$ sudo /opt/mysql/server-5.7/bin/ndbd
stefan@node3:~$ sudo /opt/mysql/server-5.7/bin/ndbd

# Start the MySQL instance
root@node1:~# service mysql start

Now the cluster should be up again. Verify it like this:

root@node1:~# /opt/mysql/server-5.7/bin/ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: node1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]	2 node(s)
id=2	@192.168.0.82  (mysql-5.7.16 ndb-7.5.4, Nodegroup: 0, *)
id=3	@192.168.0.83  (mysql-5.7.16 ndb-7.5.4, Nodegroup: 0)

[ndb_mgmd(MGM)]	1 node(s)
id=1	@192.168.0.81  (mysql-5.7.16 ndb-7.5.4)

[mysqld(API)]	1 node(s)
id=4	@192.168.0.81  (mysql-5.7.16 ndb-7.5.4)

ndb_mgm> 

Encrypt Data in Transit

To improve the security of the cluster, we can encrypt the traffic exchanged between the MySQL server and the MySQL client. Please note that this encryption method is not specific to the cluster: it encrypts the traffic between the MySQL client and the MySQL server and can be used for standard MySQL server setups as well. It does not cover the communication between the data nodes. By design, data nodes are intended to be run in a private network, not via the open Interwebs.

Create a CA Authority Key and Certificate

To do so, we work on node 1. In the first step we create a CA key and a CA certificate.

# Change into the installation directory
root@node1: cd /opt/mysql/server-5.7
# Create a folder called certs
root@node1:/opt/mysql/server-5.7# mkdir certs
# Enter the folder
root@node1:/opt/mysql/server-5.7# cd certs/
# Create the CA key
root@node1:/opt/mysql/server-5.7/certs# openssl genrsa 2048 > ca-key.pem
# Using that key, create a new CA certificate valid for 365 days. Leave the password empty 
# Example data to provide in the next step:
## Country Name (2 letter code) [AU]:AT
## State or Province Name (full name) [Some-State]:Tirol
## Locality Name (eg, city) []:Innsbruck
## Organization Name (eg, company) [Internet Widgits Pty Ltd]:Acme
## Organizational Unit Name (eg, section) []:Infra
## Common Name (e.g. server FQDN or YOUR name) []:ca-node
## Email Address []:test@example.org 

openssl req -new -x509 -nodes -days 365 -key ca-key.pem -out ca-cert.pem



Create a Server Certificate

In the second step, we create a server certificate and sign it with the CA from the previous step.

# Create a new key
root@node1:/opt/mysql/server-5.7/certs#  openssl req -newkey rsa:2048 -days 365 -nodes -keyout server-key.pem -out server-req.pem
# Convert the key 
root@node1:/opt/mysql/server-5.7/certs# openssl rsa -in server-key.pem -out server-key.pem
# Sign it with the CA
openssl x509 -req -in server-req.pem -days 365 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem

The server key is ready. We now need to create a client certificate.

Create a Client Certificate

In a rather similar fashion, we create a client certificate.

# Create the client key and certificate request, again passwordless
openssl req -newkey rsa:2048 -days 365 -nodes -keyout client-key.pem -out client-req.pem
# Convert the key
openssl rsa -in client-key.pem -out client-key.pem
# Sign it with the CA
openssl x509 -req -in client-req.pem -days 365 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out client-cert.pem

Now we have created the keys and can configure MySQL for using SSL encryption.

Configure MySQL to use Encryption

Add the following configuration parameters to the /etc/my.cnf file in order to define server and client certificates and keys.

# MySQL Server
[mysqld]
ssl-ca = /opt/mysql/server-5.7/certs/ca-cert.pem
ssl-key = /opt/mysql/server-5.7/certs/server-key.pem
ssl-cert = /opt/mysql/server-5.7/certs/server-cert.pem

# MySQL Client Configuration
[mysql]
ssl-ca = /opt/mysql/server-5.7/certs/ca-cert.pem
ssl-key = /opt/mysql/server-5.7/certs/client-key.pem
ssl-cert = /opt/mysql/server-5.7/certs/client-cert.pem

Restart the MySQL server and test the SSL encryption. You can immediately see that the client uses SSL, although it does not make a lot of sense for the local user connecting via the Unix socket.

root@node1:/opt/mysql/server-5.7/certs# mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.7.16-ndb-7.5.4-cluster-gpl MySQL Cluster Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

You are enforcing ssl conection via unix socket. Please consider
switching ssl off as it does not make connection via unix socket
any more secure.
mysql> 

Let’s create a new test user, allow remote access from 192.168.0.12 and enforce SSL connections:

mysql> CREATE USER 'ssl-test'@'192.168.0.12' IDENTIFIED BY 'obscure' REQUIRE SSL;
Query OK, 0 rows affected (0.01 sec)

Switch to the console of the machine 192.168.0.12 and try to connect. You can then display some variables about the SSL configuration.

stefan@192.168.0.12:~$ mysql -p -h 192.168.0.81 -u ssl-test -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 5.7.16-ndb-7.5.4-cluster-gpl MySQL Cluster Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

mysql> show variables like '%ssl%';
+---------------+---------------------------------------------+
| Variable_name | Value                                       |
+---------------+---------------------------------------------+
| have_openssl  | YES                                         |
| have_ssl      | YES                                         |
| ssl_ca        | /opt/mysql/server-5.7/certs/ca-cert.pem     |
| ssl_capath    |                                             |
| ssl_cert      | /opt/mysql/server-5.7/certs/server-cert.pem |
| ssl_cipher    |                                             |
| ssl_crl       |                                             |
| ssl_crlpath   |                                             |
| ssl_key       | /opt/mysql/server-5.7/certs/server-key.pem  |
+---------------+---------------------------------------------+
9 rows in set (0,01 sec)

To verify that you are actually using SSL encryption, the following command helps; it shows SSL: Cipher in use is DHE-RSA-AES256-SHA :

mysql> status
--------------
mysql  Ver 14.14 Distrib 5.7.16, for Linux (x86_64) using  EditLine wrapper

Connection id:		11
Current database:	
Current user:		ssl-test@192.168.0.12
SSL:			Cipher in use is DHE-RSA-AES256-SHA
Current pager:		stdout
Using outfile:		''
Using delimiter:	;
Server version:		5.7.16-ndb-7.5.4-cluster-gpl MySQL Cluster Community Server (GPL)
Protocol version:	10
Connection:		192.168.0.81 via TCP/IP
Server characterset:	latin1
Db     characterset:	latin1
Client characterset:	utf8
Conn.  characterset:	utf8
TCP port:		3306
Uptime:			32 min 36 sec

Threads: 4  Questions: 14  Slow queries: 0  Opens: 114  Flush tables: 2  Open tables: 9  Queries per second avg: 0.007
--------------

What about the Data Nodes?

So far we have encrypted only the traffic between the MySQL client and the MySQL server. This configuration does not differ from a single server or replication setup and does not include the traffic between the data nodes, which is not encrypted at all. MySQL Cluster has been designed to run in a controlled environment within a local high-speed network. The MySQL Cluster FAQ states:

It is very unlikely that a cluster would perform reliably under such conditions, as NDB Cluster was designed and implemented with the assumption that it would be run under conditions guaranteeing dedicated high-speed connectivity such as that found in a LAN setting using 100 Mbps or gigabit Ethernet—preferably the latter. We neither test nor warrant its performance using anything slower than this.

Also, it is extremely important to keep in mind that communications between the nodes in an NDB Cluster are not secure; they are neither encrypted nor safeguarded by any other protective mechanism. The most secure configuration for a cluster is in a private network behind a firewall, with no direct access to any Cluster data or management nodes from outside.

Continue reading