The Wall Street Journal’s Heard on the Street column this morning was oddly titled ‘Oracle’s Little Issue With Big Data’. The point of the article is that data is exploding, but the WSJ does not see how Oracle is going to take advantage of that explosion. I think it is the WSJ that has missed the point!
At a customer conference in Tokyo last week, Oracle President Mark Hurd noted how digital information is growing exponentially, from 1.8 zettabytes last year to an expected 35 zettabytes in 2020. If an 11-ounce cup of coffee represented one gigabyte, according to a Cisco Systems report, then one zettabyte would have the same volume as the Great Wall of China. Everything from digital videos to emails to health records is fueling this.
It seems that the Journal has discovered that people are using ‘cheap hardware and open-source software’ to store data (big shock to those of us in the MySQL world, eh?). It seems that some folks are using something called a NoSQL database to store data. This is, according to the article, because data like Tweets don’t really fit in a spreadsheet. I guess the recent Twitter MySQL patches slipped by unnoticed.
Yes, data is exploding, and it should bother me that we are not all buying disk drive manufacturer stock. And yes, some data does not fit the paradigm of a relational database. But I guess that Oracle’s efforts, from the Exadata to the Hadoop connector for 11g and the MySQL memcached/InnoDB and memcached/NDB interfaces, escaped the attention of the WSJ (despite many Oracle ads on the front page, in two colors, promoting the Exadata). The Journal thinks that Oracle has a hardware-only solution while this amorphous Big Data creature inhabits only commodity hardware.
Big Data is amorphous because everyone seems to have their own definition. There are applications where a document store works better than an RDBMS, or where map/reduce will meet the price and performance goals. But not everything is a key/value pair, and much of our data is relational. The technology has to match the application and the problem that the application is being used to solve. Forcing everyone to write code in one language, keep data in one data store, and treat all problems as the same is not going to work. Big Data for an airline is going to be much different than that for a cell phone company or an auto parts store or the local pizza parlor.
I have been asking folks at conferences lately what Big Data looks like for them. For the vast majority of those asked, it would all fit rather easily in one corner of an Exadata or a Teradata box, with room for several dozen other businesses. Others could take advantage of a columnar database like Infobright or InfiniDB and keep the MySQL look and feel. Others can use Hadoop or Redis to their advantage. But none of the folks I have been talking to have amounts of data that would make a data warehouser break a sweat.
But the trouble with the Wall Street Journal is that some pointy-haired executive who has one or more DBAs below them on an organization chart will read this article, cite it as gospel, and direct the staff to go full speed ahead on this ‘No squeal Big Data thingy’. So mix ‘MongoDB is Web Scale’ with the Dilbert ‘Eunuch programmers’ and we end up with this article catching the eye of an MBA who will end up causing grief for a poor DBA. Please note that I have an MBA, and I expect to hear from some of my former classmates about the ‘Dig Bata’ that their boss read about in today’s Journal.