Indexing MySQL JSON Data

“MySQL’s JSON data type is great! But how do you index the JSON data?” I was recently presenting at the CakePHP Cakefest Conference and was asked that very question. And I had to admit I had not been able to play, er, experiment with the JSON datatype to that level. Now I have and it is fairly easy.

1. Create a simple table
mysql> desc colors;
+--------------+----------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+----------+------+-----+---------+-------------------+
| popular_name | char(10) | YES | | NULL | |
| hue | json | YES | | NULL | |
+--------------+----------+------+-----+---------+-------------------+
2 rows in set (0.00 sec)

2. Add in some data
INSERT INTO `colors` VALUES ('red','{\"value\": \"f00\"}'),('green','{\"value\": \"0f0\"}'),('blue','{\"value\": \"00f\"}'),('cyan','{\"value\": \"0ff\"}'),('magenta','{\"value\": \"f0f\"}'),('yellow','{\"value\": \"ff0\"}'),('black','{\"value\": \"000\"}');

3. SELECT some data
Use the jsb_extract function to efficiently search for the row desired.
mysql> select jsn_extract(hue, '$.value') from colors where jsn_extract(hue, '$.value')="f0f";
+-----------------------------+
| jsn_extract(hue, '$.value') |
+-----------------------------+
| "f0f" |
+-----------------------------+
1 row in set (0.00 sec)

But how efficient is that? Turns out we end up doing a full table scan.

mysql> explain select jsn_extract(hue, '$.value') from colors where jsn_extract(hue, '$.value')="f0f";
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | colors | NULL | ALL | NULL | NULL | NULL | NULL | 7 | 100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+

4 Add a VIRTUAL column to index quickly
mysql> ALTER TABLE colors ADD value_ext char(10) GENERATED ALWAYS AS (jsn_extract(hue, '$.value')) VIRTUAL;
This will add a virtual column from the value data in the hue column.

5 Index the New Column
mysql> CRATE INDEX value_ext_index ON colors(value_ext);

Now the EXPLAIN shows us that we are more efficient.
mysql> explain select jsn_extract(hue, '$.value') from colors where value_ext="f0f";
+----+-------------+--------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | colors | NULL | ref | value_ext_index | value_ext_index | 11 | const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql>

Color Your MySQL Console

I started programming in the era of punch cards and learned to love monochrome screens quickly. The standard MySQL client console is a echo of those days. But we can get color.

One of the really interesting parts of my job is looking at the new submissions to Planet.MySQL.Com and seeing what interesting things people are doing with MySQL. Today I found Nicola Strappzaaon’s Swapbytes.com in the approval queue and his top article at the time was Give a little color to the MySQL Console. Actually the tile read Darle un poco de color a la consola de MySQL but Google translate overcame my insufficient skills.

Basically you install grcat ‘universal coulouriser’, install a configuration file, and tweak the .my.cnf file to use grcat as the pager. Nicola’s blog goes into more detail but here is the gist.


apt-get install grc
https://raw.githubusercontent.com/nicola51980/myterm/master/bash/dotfiles/grcat wget -O ~ / .grcat

And the .my.cnf file

[client]
pager = grcat ~/.grcat | less -RSFXin

Colorized MySQL Console
Color Added MySQL Console is quick and easy to do do.

I had change the .grcat configuration file as there was a complaint about a bad color ‘whit’ but replacing the first ‘white’ in the file with red solved that quickly. So change white to red under the section #data in ( ) and ‘ ‘.

So hats off to Nicola Strappzaaon and all the other new bloggers on Planet! And I encourage everyone to check out the different language blogs as you never know what gems you will find there.

MySQL Marinate for the Holiday Season

Just a friendly reminder that you can pick up MySQL Marinate whenever you want! You can use the master list at http://www.meetup.com/Virtual-Tech-Self-Study/messages/boards/thread/38423162 for reference – just ignore the dates and work at your own pace!

We found that very few people were taking the dates to heart, so we stopped trying to organize around them. The message boards are still valid, so feel free to ask if you have any questions – we are here to help!

The above was a quick note from Sheeri Cabral about the wonderful MySQL Marinate program that arrived in my email. This is a great way for novices to learn MySQL and for old war horses to find some new insights.

Does MySQL need a mentoring program?

Does MySQL need a mentoring program? I get calls, emails, and other requests for trained MySQL DBAs and Developers. Human Resources, managers, team leads, and entrepreneurs have the need but can not find the bodies. It is easy enough to download MySQL, get it running, and use it. But the next steps seem problematic for many. There are programs like MySQL Marinate and Girl Develop It to provide some hands on help for beginners. Autodidacts can find tons of MySQL Books and on line information. But how do we take the beginners and get them to intermediate or beyond?

How do we support these new comers, give them a hand if needed, a shoulder to cry on, or just provide someone who has been there before to bounce ideas around when needed? How do we pull them into social networks to warn them of pitfalls, pass on information about new technologies, or just be there as a friendly voice when the air movement device is being impacted by non optimal material? How do we pass on best practices, professional guidance, and the norms of our community? There is only so much forums, IRC, and Stack Overflow can handle. Local users groups are good if you have a local user group.

A good place to start is to see what other Open Source projects are doing. PHP Mentorting is a formal, personal, long term, peer to peer mentorship organization focused on creating networks of skilled developers from all walks of life. Read their info and let me know if you think the MySQL Community needs something similar.

Being a mentor has benefits too. There is an old saying that you really do not know a subject until you can pass on your knowledge to someone else. It also helps bring along someone who could replace you if you decided to climb the corporate ladder. Plus you never know what you fledgling might teach you.

So do we need a MySQL mentoring program?

Changes in MySQL 5.6.20

The MySQL Release Notes should be part of any DBA’s regular reading list. The Changes in MySQL 5.6.20 came out last week and there are some interesting goodies.

  • MySQL now includes DTrace support on Oracle Linux 6 or higher with UEK kernel.
  • A new system variable binlog_impossible_mode controls what happens if the server cannot write to the binary log, for example, due to a file error.
  • The mysqlhotcopy utility is now deprecated and will be removed in a future version of MySQL

5.6.20 has a slew of bug fixes, functionality changes, and notes.

So why should you be reading the changes on a regular basis? There isa goldmine of information in them. For instance, if you use blobs, consider this:

Important Change: Redo log writes for large, externally stored BLOB fields could overwrite the most recent checkpoint. The 5.6.20 patch limits the size of redo log BLOB writes to 10% of the redo log file size. The 5.7.5 patch addresses the bug without imposing a limitation. For MySQL 5.5, the bug remains a known limitation.

As a result of the redo log BLOB write limit introduced for MySQL 5.6, innodb_log_file_size should be set to a value greater than 10 times the largest BLOB data size found in the rows of your tables plus the length of other variable length fields (VARCHAR, VARBINARY, and TEXT type fields). Failing to do so could result in “Row size too large” errors. No action is required if your innodb_log_file_size setting is already sufficiently large or your tables contain no BLOB data. (Bug #16963396, Bug #19030353, Bug #69477)

That is golden information for those of us who used a lot of blobs and great info for configuring servers.

MySQL APT Repository

THe MySQL APT Repository provides an easy and convenient way to get the latest MySQL software. My test server was need of a refresh so I put on a fresh install of Ubuntu 14.04 and downloaded mysql-apt-config_0.2.1-1ubuntu14.04_all.deb.

sudo dpkg -i mysql-apt-config_0.2.1-1ubuntu14.04_all.deb
[sudo] password for dstokes:
Selecting previously unselected package mysql-apt-config.
(snip)

You will get a choice to install MySQL 5.6 or the latest 5.7 DMR.

sudo apt-get update Pulls the latest information from the repository for the various packages.

sudo apt-get install mysql-server Installs the server and will start it running. And then a quick sudo apt-get install mysql-workbench to get me where i needed to be.

There is a detailed information at A Quick Guide to Using the MySQL APT Repository

Triggers — MySQL 5.6 and 5.7

MySQL Triggers are changing in 5.7 in a big way. Triggers have been around since 5.0 and have not changed much up to 5.6 but will gain the ability to have multiple triggers on the same event. Previously you had ONE trigger maximum on a BEFORE UPDATE, for example, and now you can have multiple triggers and set their order.

So what is a trigger? Triggers run either BEFORE or AFTER an UPDATE, DELETE, or INSERT is performed. You also get access to the OLD.col_name and NEW.col_name variables for the previous value and the newer value of the column.

So how do you use a trigger? Let say you are updating the price of an inventory item in a product database with a simple UPDATE statement. But you also want to track when the price change and the old price.

The table for products.
CREATE TABLE products (id INT NOT NULL auto_increment,
price DECIMAL(5,2) NOT NULL,
PRIMARY KEY (id));

The table for price changes on the product table.
CREATE TABLE products_log (id INT NOT NULL,
price DECIMAL(5,2) NOT NULL,
change_date timestamp);

Now to define a trigger that will log price changes. We do this when a price is updated. Now the use od OLD.price to avoid confusion between the old price or the new price being saved in the log.
DELIMITER |
CREATE TRIGGER product_price_logger
BEFORE UPDATE ON products
FOR EACH row
BEGIN
INSERT INTO products_log (id, price)
VALUES (id, OLD.PRICE);
END
|
DELIMITER ;

Add in some data.
INSERT INTO products (price) VALUES (1.10),(2.24),(.99),(.01),(.34);

So UPDATE a record.
UPDATE products SET price='1.11' WHERE ID = 1;

So did it work? Yes, and no. Running SELECT * FROM products_log; Provides us with a time stamp of the change and the OLD.price. But I forgot to also record the id!! Challenge: Correct my mistake and compare it to an update I will make in a few days.

Now 5.7 introduces multiple triggers for the same event. Lets add yet another log this time recording who made the change;

The ‘who made the change table’.
CREATE table who_changed (
id INT NOT NULL,
who_did_it CHAR(30) NOT NULL,
when_did_it TIMESTAMP);

And the second trigger.
DELIMITER |
CREATE TRIGGER product_price_whom
BEFORE UPDATE ON products
FOR EACH ROW
FOLLOWS product_price_logger
BEGIN
INSERT INTO who_changed (id, who_did_it)
VALUES (OLD.id, user());
END
|
DELIMITER ;

So UPDATE products SET price='19.99' WHERE id=4; is run and we see that both triggers execute. Note that SHOW TRIGGERS from schema; does not provide any information on trigger order. But you can find all that as action_order in PERFORMANCE_SCHEMA.TRIGGERS

Being able to order triggers makes it easy to make logical steps when processing data. Can you get into trouble with this? I am certain someone will manage to make a mess with this. But I think most of us will enjoy being able to use this great new functionality.