We’re all familiar with the eye-popping, jaw-dropping numbers associated with Big Data. According to Wipro, more than 50 million tweets are generated every day, more than 2.9 million emails are sent every second, and 20 hours of video are uploaded every minute; the list goes on. We’re so inundated with data-explosion statistics that Big Data is starting to become a Big Yawn.
But wait. Put your virtual reality helmet on and imagine this data storage scenario, courtesy of IDC Analyst Eric Burgener.
Before and After
In yesterday’s world, when our storage needs increased, we added more capacity through products such as disk arrays and tape libraries. Like adding boxcars to a train without adding locomotives, these lumbering machines lacked processing power of their own and slowed the performance of our systems. We called it the “scale up” model of storage.
Now, imagine an environment where the software that manages the storage infrastructure is independent of the hardware. This policy-based software defines which type of data goes where: archival material to tape drives, mission-critical data to flash drives, and everything in between to disk drives. This software-defined environment changes everything. According to Burgener:
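The policy idea is simple enough to sketch in a few lines. Below is a minimal, hypothetical illustration of how such placement rules might look once they live in software rather than hardware; the function and field names are illustrative, not any vendor’s actual policy language.

```python
# Hypothetical sketch of policy-based data placement, the core idea of
# software-defined storage: the rules live in software, independent of
# the hardware tiers they route to. All names here are illustrative.

def place(item):
    """Route a data item to a storage tier based on a simple policy."""
    if item.get("mission_critical"):
        return "flash"   # hot, latency-sensitive data
    if item.get("archival"):
        return "tape"    # cold, rarely accessed data
    return "disk"        # everything in between

print(place({"name": "orders.db", "mission_critical": True}))  # flash
print(place({"name": "2009-logs.tgz", "archival": True}))      # tape
print(place({"name": "reports.csv"}))                          # disk
```

Because the rules are just software, changing where a class of data lands is an edit to a policy, not a forklift upgrade of hardware.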
“What this architecture does is let you buy a whole bunch of cheap, x86-based servers, each with its own processing power, for $4,000 or $5,000 each, and put some storage in them — flash drives or spinning drives or both. You can get in at $10,000 to $15,000, you never incur the cost of buying this refrigerator-sized array, and you can scale up to a petabyte or more by adding boxes one at a time. You’re also adding performance to help you deal with the additional capacity.”
And all of this adds up to a potential reduction in storage costs by up to 90 percent.
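A back-of-the-envelope calculation shows how the node-at-a-time economics in Burgener’s quote play out. The node price comes from the quote; the per-node capacity figure is an assumption of ours for illustration only.

```python
# Rough scale-out cost model. NODE_COST is from the quote above;
# NODE_CAPACITY_TB is a hypothetical assumption, not from the source.

NODE_COST = 5_000      # dollars per commodity x86 server (from quote)
NODE_CAPACITY_TB = 50  # assumed usable TB per node (illustrative)

def cost_to_reach(capacity_tb):
    """Total cost of growing one node at a time to a target capacity."""
    nodes = -(-capacity_tb // NODE_CAPACITY_TB)  # ceiling division
    return nodes * NODE_COST

petabyte = 1_000  # TB
print(cost_to_reach(petabyte))  # 20 nodes at $5,000 each: 100000
```

Under these assumed figures, a petabyte costs on the order of $100,000 in commodity boxes, and each box added brings compute along with capacity.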
Yottabytes: A Lot of Data, It Is…
IBM calls this type of software-defined environment, where the software is separate from the infrastructure, “elastic storage”. It is an offering based on technology used in Watson. And wherever Watson goes, more eye-popping numbers follow: elastic storage can scan 10 billion files on a single cluster in 43 minutes, claims IBM, and the company says the capability can scale to thousands of yottabytes.
Underneath the elastic storage offering is IBM’s GPFS (General Parallel File System), which provides online storage management, scalable access, and integrated data governance tools capable of managing vast amounts of data and billions of files. According to CIO Insight, “CIOs should ultimately focus on elastic storage because the type and quantity of data may also vary at different times of the day or week, or during different seasons.” The ability to automatically move data onto the most economical storage device has the potential to dramatically reduce costs.
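The automatic-movement idea described above can be sketched simply: periodically re-evaluate each file and migrate cold data down to cheaper tiers. This is a hedged illustration of the concept, not GPFS’s actual policy engine; the thresholds and tier names are assumptions.

```python
# Sketch of access-based auto-tiering: files that cool off migrate to
# more economical storage. Thresholds and tier names are illustrative,
# not GPFS's real policy syntax.
import time

DAY = 86_400  # seconds per day

def target_tier(last_access_ts, now=None):
    """Pick a tier by how recently the file was accessed."""
    now = time.time() if now is None else now
    age_days = (now - last_access_ts) / DAY
    if age_days < 7:
        return "flash"   # hot: touched within the week
    if age_days < 180:
        return "disk"    # warm
    return "tape"        # cold: untouched for six months

now = time.time()
print(target_tier(now - 1 * DAY, now))    # flash
print(target_tier(now - 30 * DAY, now))   # disk
print(target_tier(now - 365 * DAY, now))  # tape
```

Run on a schedule, a rule like this keeps seasonal or time-of-day data on the tier that matches its current value, which is exactly the cost lever CIO Insight is pointing at.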