Indexing MySQL JSON Data

“MySQL’s JSON data type is great! But how do you index the JSON data?” I was recently presenting at the CakePHP Cakefest Conference and was asked that very question. And I had to admit I had not been able to play, er, experiment with the JSON datatype to that level. Now I have and it is fairly easy.

1. Create a simple table
mysql> desc colors;
+--------------+----------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+----------+------+-----+---------+-------------------+
| popular_name | char(10) | YES | | NULL | |
| hue | json | YES | | NULL | |
+--------------+----------+------+-----+---------+-------------------+
2 rows in set (0.00 sec)

2. Add in some data
INSERT INTO `colors` VALUES ('red','{\"value\": \"f00\"}'),('green','{\"value\": \"0f0\"}'),('blue','{\"value\": \"00f\"}'),('cyan','{\"value\": \"0ff\"}'),('magenta','{\"value\": \"f0f\"}'),('yellow','{\"value\": \"ff0\"}'),('black','{\"value\": \"000\"}');

3. SELECT some data
Use the jsb_extract function to efficiently search for the row desired.
mysql> select jsn_extract(hue, '$.value') from colors where jsn_extract(hue, '$.value')="f0f";
+-----------------------------+
| jsn_extract(hue, '$.value') |
+-----------------------------+
| "f0f" |
+-----------------------------+
1 row in set (0.00 sec)

But how efficient is that? Turns out we end up doing a full table scan.

mysql> explain select jsn_extract(hue, '$.value') from colors where jsn_extract(hue, '$.value')="f0f";
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | colors | NULL | ALL | NULL | NULL | NULL | NULL | 7 | 100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+

4 Add a VIRTUAL column to index quickly
mysql> ALTER TABLE colors ADD value_ext char(10) GENERATED ALWAYS AS (jsn_extract(hue, '$.value')) VIRTUAL;
This will add a virtual column from the value data in the hue column.

5 Index the New Column
mysql> CRATE INDEX value_ext_index ON colors(value_ext);

Now the EXPLAIN shows us that we are more efficient.
mysql> explain select jsn_extract(hue, '$.value') from colors where value_ext="f0f";
+----+-------------+--------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | colors | NULL | ref | value_ext_index | value_ext_index | 11 | const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql>

JSON and the MySQL Argonauts

The MySQL 5.7.7 JSON lab release has been getting a lot of attention. At a recent conference, I was cornered by a developer who wanted to jump in with both feet by running this release on his laptop on the flight home. Jason and the Argonaughts Movie Poster However the developer was not sure how to begin.

1. Down load the MySQL JSON release from http://labs.mysql.com/. You will get the choice of a Linux binary or source code. Please grab the binary if you are using Linux and un-gzip/tar the download.

2. Shut down the current running version of MySQL. I was lucky in this case that the developer was using a recent copy of Ubuntu.

3. Change directory to the ~/Downloads/mysql-5.7.7-labs-json-linux-el6-x86_64 directory.

4. sudo ./bin/mysqld_safe –user=mysql&

5. ./bin/mysql -u root -p, then provde the password.

6. Enter a \s to get the status. This will confirm that you are using the JSON labs release.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.7-labs-json MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

If you are doing more than simple tests, run mysql_upgrade to update the system tables. But you can skip this step for a quick and dirty exploration but do not expect your JSON data to be around when you go back to the previous version of MySQL.

7. Now you can start testing the JSON data type. I recommend starting with reading JSON Labs Release: JSON Functions, Part 1 — Manipulation JSON Data, JSON Labs Release: JSON Functions, Part 2 — Querying JSON Data, and JSON Labs Release: Native JSON Data Type and Binary Format. Then follow up with JSON Labs Release: Effective Functional Indexes in InnoDB to understand how to create indexes.

Zero to DBA Track at SELF

MySQL is proud to sponsor the Zero to DBA Track at the Southeast Linuxfest next June 11th-14th! Hotel rooms at the conference rate are going very quickly.

So what do you learn at the Zero to DBA sessions? There are two days of presentations from the MySQL and Postgresql communities. The goal of this track was to take novices and turn them into DBAs or at least give novices a good ‘leg up’ in the world. This is a friendly crowd with lots of other non-database sessions and many great non-tech social functions.

Friday
The Proper Care and
Feeding of MySQL for the
Busy Linux Admin
Dave Stokes

The MySQL Ecosystem
Colin Charles

Binary Log Management
Made Easy With MySQL
Utilities
Charles Bell

Scaling MariaDB and MySQL
Max Mether

To Shard or Not To Shard
Peter Zaitsev

Nulls Make Things Easier?
Bruce Momjian

MariaDB/MySQL Security
Essentials
Colin Charles

Saturday

Common Table Expressions
Richard Hipp

Practical MySQL
Performance Optimization
Peter Zaitsev

MySQL Workbench
Dave Stokes

MySQL Central @ Oracle Open World CFP Closes April 29th!

The Call for papers for MySQL Central @ Oracle Open World closes in just over a week! Now is the time to put together those ideas for a presentation you have had into shape before the last minute rush. The conference starts October 25th but the CFP closes April 29th! But we want you at this conference to present your best material.

So what if you have never submitted a talk in the past to any conference OR have not manged to get accepted for MySQL Central before? Get your best technical material together and write down the topic, 3-5 bullet points, and a conclusion. Then put your title, bullet points, and conclusion into a paragraph. That paragraph needs to fit into one of the six MySQL Central @ OpenWorld tracks:

  • Performance & Scalability
  • High Availability
  • NoSQL & Big Data
  • Cloud and DevOps
  • Database Administration
  • Architecture and Application Development

Submit your talk today here!

For an example, lets pretend your name is Montgomery Scott and you are a chief engineer. You have spent the last few months studying storing data from your transporter in BLOB under various storage engines. The title would obvious be Storing Transporter Data Efficiently with Various Storage Engines. Your main points are 1) It is easy to store transporter data in MySQL, 2) Blobs are used to store data because it is amorphous, 3) You would like to save the data in JSON for better investigation of the information, 4) Use InnoDB for crash recovery and row level locking as MyIsam proved too fragile and had crash recovery issues, and 5) Black Hole storage engine is fine for Tribbles. Your conclusion would be It is easy to store transporter data in MySQL given that you take the proper precautions.

Now smooth that into a paragraph. Get someone with some writing skill or marketing background to help smooth the flow of the sentences.

Storing Transporter Data Efficiently with Various Storage Engines
Data from teleportation transporters is often tricky to store. MySQL can provide fast and efficient storage of transporter data sets. Currently we use blobs due to the amorphous nature of the data but JSON is considered to provide better management and searching of the data. Initial MyIsam usage proved problematic and other storage engines have been successfully used. InnoDB’s features provide better crash recovery for this critical data and simultaneous lookup of data. Use of the Black Hole storage engine has proven to be the best performing for small furry data sets should not be used for no ephemeral information. This session will cover our best practices that we discovered for storing transporter data.

So Scotty can bean up, er, post paper today.

MySQL Fabric – Faulty to Secondary

MySQL Fabric is a great tool for managing a farm of MySQL servers. In the last two posts you can see how High Availability Fail over and Sharding work. When things are working right, the group lookup_servers will look something like this:
server_uuid address status mode weight
------------------------------------ ----------- --------- ---------- ------
09d13be1-cdd0-11e4-8643-0800273b9c80 10.10.13.10 SECONDARY READ_ONLY 1.0
0f611996-cdd0-11e4-8643-0800273b9c80 10.10.13.20 SECONDARY READ_ONLY 1.0
11aae7e7-cdd0-11e4-8643-0800273b9c80 10.10.13.30 PRIMARY READ_WRITE 1.0

This is a three node farm with a PRIMARY for READ-WRITE operations and two SECONDARY servers for READ_ONLY.

But that about the times when one of the servers is not doing do great and shows up as FAULTY on a group lookup_servers or group health report?

uuid is_alive status is_not_running is_not_configured io_not_running sql_not_running io_error sql_error
------------------------------------ -------- --------- -------------- ----------------- -------------- --------------- -------- ---------
09d13be1-cdd0-11e4-8643-0800273b9c80 1 FAULTY 0 0 0 0 False False
0f611996-cdd0-11e4-8643-0800273b9c80 1 SECONDARY 0 0 0 0 False False
11aae7e7-cdd0-11e4-8643-0800273b9c80 1 PRIMARY 0 0 0 0 False False

Possible values for status are FAULTY, SPARE, SECONDARY, PRIMARY, or CONFIGURING. So how is good old server ’09d13be1-cdd0-11e4-8643-0800273b9c80′ changed from FAULTY to SECONDARY?

The path is not as simple as just setting the status to to SECONDARY. First the server must be marked SPARE with mysqlfabric server set_status 09d13be1-cdd0-11e4-8643-0800273b9c80 spare. Use group lookup_servers to note the change of ’09d13be1-cdd0-11e4-8643-0800273b9c80′ to SPARE.

server_uuid address status mode weight
------------------------------------ ----------- --------- ---------- ------
09d13be1-cdd0-11e4-8643-0800273b9c80 10.10.13.10 SPARE OFFLINE 1.0
0f611996-cdd0-11e4-8643-0800273b9c80 10.10.13.20 SECONDARY READ_ONLY 1.0
11aae7e7-cdd0-11e4-8643-0800273b9c80 10.10.13.30 PRIMARY READ_WRITE 1.0

Nut now the mode is OFFLINE. Now the status can be changed to SECONDARY.
mysqlfabric server set_status 09d13be1-cdd0-11e4-8643-0800273b9c80 secondary

And good old ’09d13be1-cdd0-11e4-8643-0800273b9c80′ is back as a happy, productive member of the farm.

Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

server_uuid address status mode weight
------------------------------------ ----------- --------- ---------- ------
09d13be1-cdd0-11e4-8643-0800273b9c80 10.10.13.10 SECONDARY READ_ONLY 1.0
0f611996-cdd0-11e4-8643-0800273b9c80 10.10.13.20 SECONDARY READ_ONLY 1.0
11aae7e7-cdd0-11e4-8643-0800273b9c80 10.10.13.30 PRIMARY READ_WRITE 1.0

MySQL Fabric and Sharding

Last time we set up a High Availability server farm with MySQL Fabric. Now it is time to set up sharding. I will be using the good old World database and sharding the City table on the ID field. There are 4,079 cities in this table and they will be split in two. So one shard, that we will call CityLow will have the records 2,000 and below and the other records at 2,001 and above will be called CityHigh. We also need a global group for setting up sharding that will be called CityGlobal.

Sadly, the first step is to remove the previous setup with mysqlfabric manage teardown. This will remove the fabric database from the Fabric controller. Fabric itself has to be stopped with mysqlfabric manage stop. The command mysqlfabric manage setup will set up a fresh, clean fabric database. And then we can start the Fabric controller with mysqlfabric manage start --daemonize.

Now we crate the groups for CityLow, CityHigh, and CityGlobal.

mysqlfabric group create CityLow
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

uuid finished success result
------------------------------------ -------- ------- ------
a7cafabf-1550-4f7c-a8e7-79458e47b9d5 1 1 1

state success when description
----- ------- ------------- ------------------------------------------------------------------
3 2 1428436552.17 Triggered by .
4 2 1428436552.2 Executing action (_create_group).
5 2 1428436552.26 Executed action (_create_group).

mysqlfabric group create CityHigh
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

uuid finished success result
------------------------------------ -------- ------- ------
81a92aa2-1c71-4327-9fc5-4e8098ebff5c 1 1 1

state success when description
----- ------- ------------- ------------------------------------------------------------------
3 2 1428436556.61 Triggered by .
4 2 1428436556.64 Executing action (_create_group).
5 2 1428436556.7 Executed action (_create_group).

mysqlfabric group create CityGlobal
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

uuid finished success result
------------------------------------ -------- ------- ------
ee83da52-fae2-4355-95b3-1ac58039c8d7 1 1 1

state success when description
----- ------- ------------- ------------------------------------------------------------------
3 2 1428436562.23 Triggered by .
4 2 1428436562.26 Executing action (_create_group).
5 2 1428436562.32 Executed action (_create_group).

Servers are added with mysqlfabric group add CityLow 10.10.13.10, myslfabric group add CityHigh 10.10.13.20, and mysqlfabric group add CityGLobal 10.10.13.30. Then we promote a sever in each group with mysqlfabric group promote CityLow, mysqlfabric group promote CityHigh, and lastly mysqlfabric group promote CityGlobal.

A unique sharding ID is needed and I missed noting this unique number as I did not see it (here the docs are not matching the output of the commands). And this gave me problems with the step following this one.
mysqlfabric sharding create_definition range CityClobal
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

AttributeError: ‘NoneType’ object has no attribute ‘master’
I ignored the above error and carried on hoping all was well.

And I ran smack into a problem trying tell Fabric which table to use. mysqlfabric sharding add table_table 1 City.World ID The error message I recevied was DatabaseError: Command (INSERT INTO shard_tables(shard_mapping_id, table_name, column_name, range_check) VALUES(%s, %s, %s, %s), (‘1’, ‘World.City’, ‘ID’, False)) failed accessing (localhost:3306). 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`fabric`.`shard_tables`, CONSTRAINT `fk_shard_mapping_id` FOREIGN KEY (`shard_mapping_id`) REFERENCES `shard_maps` (`shard_mapping_id`)).!! I had seen ‘1’ in the docs but that was not working for me. So I peeked in the fabric.shard_maps table and found the magic number ‘2’.

mysqlfabric sharding add_table 2 World.City ID
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

uuid finished success result
------------------------------------ -------- ------- ------
ab98c0a6-390d-4334-992e-557fc37b7a2d 1 1 1

state success when description
----- ------- ------------- ------------------------------------------------------------------
3 2 1428437358.46 Triggered by .
4 2 1428437358.49 Executing action (_add_shard_mapping).
5 2 1428437358.56 Executed action (_add_shard_mapping).

Now tell Fabric how slice and dice the table into shards. Remember we want two — Low with 0-2000, and High with the rest.
mysqlfabric add_shard 2 CityLow/0,CityHigh/2000 --state=ENABLED
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

uuid finished success result
------------------------------------ -------- ------- ------
f20da46d-6cc1-4464-a0dd-991b35d83e97 1 1 1

state success when description
----- ------- ------------- ------------------------------------------------------------------
3 2 1428437428.99 Triggered by .
4 2 1428437429.03 Executing action (_add_shard).
5 2 1428437429.3 Executed action (_add_shard).
3 2 1428437429.22 Triggered by .
4 2 1428437429.3 Executing action (_add_shard_range_check).
5 2 1428437429.42 Executed action

So now we check to see which shard hold the record with the ID of 15 and it should be the CityLow shard on 10.10.13.10!
$mysqlfabric sharding lookup_servers World.City 15
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

server_uuid address status mode weight
------------------------------------ ----------- ------- ---------- ------
09d13be1-cdd0-11e4-8643-0800273b9c80 10.10.13.10 PRIMARY READ_WRITE 1.0

And is ID 3333 on the CityHigh shard?

$mysqlfabric sharding lookup_servers World.City 3333
Fabric UUID: 5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1

server_uuid address status mode weight
------------------------------------ ----------- ------- ---------- ------
0f611996-cdd0-11e4-8643-0800273b9c80 10.10.13.20 PRIMARY READ_WRITE 1.0

Yes!

The table is sharded! Next blog will cover some sharding magic.