Information On Demand 2011: Steve Mills On Big Data
Greetings from the Mandalay Bay Hotel and Convention Center in Viva Las Vegas, Nevada.
I’m pretty sure I saw Elvis in the hallway yesterday, joined by Marilyn Monroe, and they were taking pictures with IODers.
My mom would have been proud (Elvis used to write on her arm after shows at the Louisiana Hayride), but I was too busy getting my fill of big data.
Speaking of which, BBC presenter Katty Kay returned in this morning’s general session to remind us of some big data statistics, including this one: There are now over 34K Google searches per second!
And in our Information On Demand polling overnight, the most popular name at IOD 2011 was tomorrow’s keynote speaker and Moneyball author, Michael Lewis. We’re all looking forward to his discussion with Oakland A’s general manager Billy Beane.
And I, of course, will continue to root on my Texas Rangers as they go 3-2 in the World Series against the St. Louis Cardinals.
Now, enter Steve Mills on the big stage at IOD to tell us more about Big Data.
In his keynote session, Mills explained that we live in a world where the art of the possible keeps improving with the advent of new technologies.
Mills recalled the days when he had to pick up extra RAM, all 128KB of it, from Endicott, NY, and deliver it to IBM customers in Albany.
Nobody talks about data or RAM in terms of “Ks” anymore — these days, we’re talking petabytes.
The challenge, Mills suggested, is turning all that additional data into useful information: homing in on patterns and relationships and on what the data could be telling us.
It’s like mining for gold, Mills went on, but there’s a lot of dirt and rock you have to remove to get to the “vein.”
Mills explained that though data is increasing in volume, it’s also metamorphosing in a way: Data is no longer a static thing; increasingly, we’re dealing with “data in motion.” Think about traffic data, or sensor outputs from pipelines: the stream is never-ending, so the data is always moving.
There’s also the issue of variety we have to contend with, Mills explained: We’re dealing in all kinds of data types, from audio to video, and certainly no longer just numbers and text.
The big data challenge, then, is how to take advantage of all the possibilities, including high performance hardware and rich bandwidth, and pull together comprehensive solutions to enable governments and businesses to deal effectively with this new volume.
Watson, the IBM computing system that won the “Jeopardy!” match earlier in the year, is a good example of how all these different capabilities can come together. It included big data technologies like Hadoop, as well as DB2, language understanding, and an alert system that allowed Watson to iterate and improve. It was a system of elements brought together to target a specific problem.
Which is exactly what we’re doing with our customers, Mills explained.
Take Catalina Marketing, a retail marketing company that deployed real-time analysis of current transactions and past purchasing history to trigger printouts of customer-specific offers. That’s some 300 million retail transactions per week, some 195 million shopper households, and 400+ billion market basket records!
The solution: IBM Netezza, which allows them to do real-time database analytics.
Or Banco Bilbao Vizcaya Argentaria (BBVA), which deployed IBM Cognos Consumer Insight based on IBM InfoSphere BigInsights and Apache Hadoop to analyze internet and social media sentiment (5.8 terabytes of data) about the bank.
Mills went through several more examples, and his message was this: No two problems are the same.
There is a constant need for customization, which IBM solutions can provide.
Still, patterns do emerge, and you can deal with them creatively, though doing so requires a very broad range of technical capability up and down the line.
“Let’s have a great big data day,” Mills concluded.
Blogger’s Note: Read this blog post by Steve Mills to learn more about the opportunities and challenges presented by Big Data.