Big Moves In Big Data: IBM's New Data Acceleration And Hadoop Capabilities

IBM made a significant announcement earlier today concerning new technologies designed to help companies and governments tackle Big Data by making it simpler, faster and more economical to analyze massive amounts of data. The new data acceleration innovation results in as much as 25 times faster reporting and analytics.
Today’s announcement, which represents the work of hundreds of IBM developers and researchers in labs around the world, includes an industry-first innovation called “BLU Acceleration,” which combines a number of techniques to dramatically improve analytical performance and simplify administration.
Also announced was the new IBM PureData System for Hadoop, designed to make it easier and faster to deploy Hadoop in the enterprise. Hadoop is the game-changing open-source software used to organize and analyze vast amounts of structured and unstructured data, such as posts to social media sites, digital pictures and videos, online transaction records, and cell phone location data.
The new system can reduce the ramp-up time organizations need to adopt enterprise-class Hadoop technology from weeks to minutes, with powerful, easy-to-use analytic tools and visualization for both business analysts and data scientists.
In addition, it provides enhanced Big Data tools for monitoring, development and integration with many more enterprise systems.
IBM Big Data Innovations: More Accessible, Enterprise-ready
As organizations grapple with a flood of structured and unstructured data generated by computers, mobile devices, sensors and social networks, they’re under unprecedented pressure to analyze much more data at faster speeds and at lower costs to help deepen customer relationships, prevent threats and fraud, and identify new revenue opportunities.
BLU Acceleration enables users to have much faster access to key information, leading to better decision-making. The software extends the capabilities of traditional in-memory systems — which load data into random access memory (RAM) instead of reading it from hard disks for faster performance — by providing in-memory performance even when data sets exceed the size of available memory.
During testing, some queries in a typical analytics workload were more than 1000 times faster when using the combined innovations of BLU Acceleration.
Innovations in BLU Acceleration include “data skipping,” which skips over data that doesn’t need to be analyzed, such as duplicate information; the ability to analyze data in parallel across different processors; and the ability to analyze data transparently to the application, without the need to develop a separate layer of data modeling.
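IBM doesn’t publish BLU’s internals in this announcement, but the intuition behind data skipping is easy to sketch: keep a small synopsis of per-block minimum and maximum values, and never read a block whose range cannot satisfy the query’s predicate. The block size and function names below are illustrative assumptions, not BLU’s actual design.

```python
# Toy illustration of "data skipping": keep per-block min/max
# metadata (a synopsis), and skip blocks that cannot possibly
# match a predicate. Names here are illustrative, not BLU's API.

BLOCK_SIZE = 4

def build_synopsis(column):
    """Record (min, max) for each fixed-size block of the column."""
    blocks = [column[i:i + BLOCK_SIZE] for i in range(0, len(column), BLOCK_SIZE)]
    return [(min(b), max(b), b) for b in blocks]

def select_greater_than(synopsis, threshold):
    """Scan only blocks whose max exceeds the threshold."""
    hits, blocks_read = [], 0
    for lo, hi, block in synopsis:
        if hi <= threshold:      # the whole block can be skipped unread
            continue
        blocks_read += 1
        hits.extend(v for v in block if v > threshold)
    return hits, blocks_read

column = [3, 1, 2, 4, 90, 95, 92, 99, 5, 6, 7, 8]
synopsis = build_synopsis(column)
matches, read = select_greater_than(synopsis, 50)
print(matches, f"(read {read} of {len(synopsis)} blocks)")
# -> [90, 95, 92, 99] (read 1 of 3 blocks)
```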
Another industry-first advance in BLU Acceleration is called “actionable compression,” where data no longer has to be decompressed to be analyzed.
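The announcement doesn’t explain the mechanics of actionable compression, but one standard way to evaluate predicates without decompressing is order-preserving dictionary encoding: if codes are assigned in sort order, comparing codes gives the same answer as comparing the underlying values. A minimal sketch, assuming that technique (not BLU Acceleration’s actual format):

```python
# Minimal sketch of predicate evaluation on compressed data via
# order-preserving dictionary encoding (an assumed illustration of
# the general technique, not BLU Acceleration's actual format).
import bisect

values = ["ankle", "elbow", "knee", "shoulder", "wrist", "knee", "elbow"]

# Build an order-preserving dictionary: code order == value order.
dictionary = sorted(set(values))               # ['ankle', 'elbow', ...]
encode = {v: i for i, v in enumerate(dictionary)}
column = [encode[v] for v in values]           # compressed column of ints

# Predicate: value >= "knee". Encode the literal once...
threshold = bisect.bisect_left(dictionary, "knee")

# ...then compare codes directly; no row is ever decompressed.
matching_rows = [row for row, code in enumerate(column) if code >= threshold]
print(matching_rows)   # rows holding 'knee', 'shoulder', 'wrist', 'knee'
```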
Not IBM’s First Big Data Rodeo
The new offerings expand what is already the industry’s deepest portfolio of Big Data technologies and solutions, spanning software, services, research and hardware. The IBM Big Data platform combines traditional data warehouse technologies with new Big Data techniques, such as Hadoop, stream computing, data exploration, analytics and enterprise integration, to create an integrated solution to address these critical needs.
IBM PureData System for Hadoop is the next step forward in IBM’s overall strategy to deliver a family of systems with built-in expertise that leverages its decades of experience reducing the cost and complexity associated with information technology.
This new system integrates IBM InfoSphere BigInsights, which allows companies of all sizes to manage and analyze data cost-effectively, and adds administrative, workflow, provisioning and security features along with best-in-class analytical capabilities from IBM Research.
Today’s announcement also includes the following new versions of IBM’s Big Data solutions:
- A new version of InfoSphere BigInsights, IBM’s enterprise-ready Hadoop offering, which makes it simpler to develop applications using existing SQL skills and adds the compliance, security and high-availability features vital for enterprise applications. BigInsights offers three entry points: a free download, enterprise software and now an expert integrated system, the IBM PureData System for Hadoop.
- A new version of InfoSphere Streams, unique “stream computing” software that enables massive amounts of data in motion to be analyzed in real time, with performance improvements and simplified application development and deployment (a generic sketch of the idea follows this list).
- A new version of Informix, including TimeSeries Acceleration for operational reporting and analytics on smart-meter and sensor data.
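Streams applications are actually written in IBM’s SPL language, which this announcement doesn’t show; the core idea of stream computing (analyzing tuples as they arrive, over a sliding window, rather than landing them in storage first) can be sketched generically in plain Python:

```python
# Generic sketch of the stream-computing idea behind InfoSphere
# Streams: analyze readings as they arrive, over a sliding window,
# instead of storing them first. Illustrative Python only; this is
# not Streams' actual SPL language or API.
from collections import deque

def sliding_average(stream, window_size=3):
    """Emit a running average over the last `window_size` readings."""
    window = deque(maxlen=window_size)
    for reading in stream:        # data in motion: one pass, no storage
        window.append(reading)
        yield sum(window) / len(window)

sensor_feed = [10.0, 12.0, 11.0, 50.0, 13.0]   # e.g. smart-meter readings
for avg in sliding_average(sensor_feed):
    print(round(avg, 2))
# 10.0, 11.0, 11.0, 24.33, 24.67
```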
Pricing and Availability
All offerings will be available in Q2, except the PureData System for Hadoop, which will start shipping to customers in the second half of 2013. Credit-qualified clients can take advantage of simple, flexible lease and loan packages with no up-front payments for the software and systems that deliver a new generation of data analytics.
IBM Global Financing offers attractive leasing programs with 90-day payment deferrals for the PureData System for Hadoop, as well as zero percent loans for the broader portfolio of IBM big data solutions.
Blue Water
Population growth, massive urbanization and climate change are placing increasing demands on our limited water supply. Forty-one percent of the world’s population – that’s 2.3 billion people – lives in water-stressed areas; this number is expected to grow to 3.5 billion by 2025.
And according to the United Nations, water use has been growing at more than twice the rate of population increase over the last century.
With advances in technology — deep computing and Big Data analytics linked to sophisticated sensor networks and smart meters — IBM is helping clients and partners make smarter decisions about water management.
By monitoring, measuring and analyzing water systems, from rivers and reservoirs to pumps and pipes, we can better understand the issues around water. IBM is applying its expertise in smart systems and Big Data to help companies, governments and citizens understand and more effectively deal with these issues.
Waterfund LLC announced today that it has signed an agreement with IBM to develop a Water Cost Index (WCI).
Scientists from IBM Research will apply Big Data expertise, acting as calculation agent, to analyze large and diverse unstructured data sets. This analysis will be used to develop a WCI framework that estimates the cost of water in different regions around the world. With its market and financial product expertise, Waterfund will work to structure and commercialize the WCI.
Discerning The Real Cost Of Water
As governments are increasingly forced to turn to the private sector to fund the construction and maintenance of complex water networks, the Rickards Real Cost Water Index™ will serve as a benchmark for helping measure hundreds of critical projects on a like-for-like basis.
Index values will reflect estimated water production costs measured in US dollars per cubic metre for a variety of major global water infrastructure projects ranging from retail water utilities and wholesale water utilities to major transmission projects.
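The announcement doesn’t disclose the index methodology. A common way to express production cost in dollars per cubic metre is a levelized calculation over annualized capital, energy and operating costs, which is the assumption behind this sketch; every figure below is hypothetical.

```python
# Hypothetical levelized cost-of-water calculation in USD per cubic
# metre. The formula and all figures are assumptions for
# illustration; the WCI's actual methodology is not public here.

def annualized_capex(capex, rate, years):
    """Spread capital cost over the project life at a given interest rate."""
    factor = rate / (1 - (1 + rate) ** -years)   # standard annuity factor
    return capex * factor

capex = 250_000_000       # plant construction cost, USD (hypothetical)
rate = 0.06               # cost of capital (interest-rate risk lives here)
life = 25                 # project life, years
energy_opex = 9_000_000   # annual energy cost, USD
other_opex = 4_000_000    # annual maintenance/labor, USD
volume = 36_500_000       # cubic metres delivered per year

cost_per_m3 = (annualized_capex(capex, rate, life)
               + energy_opex + other_opex) / volume
print(f"${cost_per_m3:.2f} per cubic metre")   # -> roughly $0.89
```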
“The backlog of investment in water systems around the world by some estimates approaches $1 trillion – quite apart from the hundreds of millions of people who have never had access to a water or sanitation system at all,” said IBM Distinguished Engineer and Big Green Innovations CTO Peter Williams.
“By creating a benchmark cost for water we intend to harness the capital market to this supremely important cause. If we can make it easier to price investments in the water sector, we can improve the flow of capital into an area where it is desperately needed. We look forward to working with Waterfund to bring this about.”
Scott Rickards, President & CEO of Waterfund said, “The principal reason behind our decision to work with IBM was their unique combination of expertise in the water sector combined with the best data analytics available. Our initiative with IBM will finally bring real financial transparency to the water sector. By calculating the unsubsidized cost of freshwater production using IBM’s Big Data expertise, Waterfund can offer the first flexibly-tailored financial tools to investors in water infrastructure. The Rickards Real Cost Water Index™ highlights the energy costs, interest rate risk, and capital expenditures required to build and maintain large-scale water treatment and delivery networks.”
Smarter Water Management Examples
Typically, investors have turned to the public equity markets to gain exposure to the water sector, with mixed results. The WCI is intended to provide a market benchmark and to spur the development of third-generation financial products for both water producers and investors and to aid the growth of the water sector globally.
Here are two examples of how it would work:
Scenario 1: A Water Agency cannot obtain bank financing for Phase 2 of a seawater desalination plant project due to previous cost overruns on Phase 1. Yet the Agency lacks the water it needs to supply a contractually specified daily volume of water to its largest customer, with a consequent risk of large penalties for each day of insufficient volume. Using strike and trigger values based on the WCI, the Water Agency could purchase a $25 million, two-year insurance product.
Payout to the Water Agency would be triggered on the total change in its Water Cost Index (as well as some other conditions, such as a specified increase in asset failure costs). This approach would enable the Water Agency to enhance its overall credit profile with the insurance enabled by the WCI, finance Phase 2 of the desalination plant and meet its supply obligations.
Scenario 2: A large desalination and water transmission system project needs to secure private equity and institutional funding alongside that from development banks and sovereign funds, to the tune of one third of the total project cost. To achieve this, the project needs a way to reduce risk to its investors.
Based on movement in the WCI, the project could purchase $50 million in insurance. This would enable the insurance product to then be underwritten by a large reinsurer and allow the project to secure the private sector contribution it needs in order to proceed.
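Both scenarios describe what is essentially parametric insurance: the payout is keyed to movement in the index rather than to claimed losses. Here is a minimal sketch of that trigger logic, with hypothetical strike and trigger values; nothing below reflects actual WCI contract terms.

```python
# Minimal sketch of an index-triggered (parametric) payout like the
# one in Scenario 1. Strike, trigger and payout schedule are all
# hypothetical; real WCI-linked contracts are not specified here.

def wci_payout(index_start, index_now, trigger_rise, limit,
               asset_failure_costs=0.0, failure_threshold=float("inf")):
    """Pay the full limit once the index has risen past the trigger,
    or when a secondary condition (asset-failure costs) is breached."""
    index_change = (index_now - index_start) / index_start
    if index_change >= trigger_rise or asset_failure_costs >= failure_threshold:
        return limit
    return 0.0

# Scenario 1 style: $25M limit, triggered by a 15% rise in the
# agency's Water Cost Index over the life of the contract.
payout = wci_payout(index_start=0.80,   # $/m3 at inception (hypothetical)
                    index_now=0.95,     # $/m3 today (hypothetical)
                    trigger_rise=0.15,
                    limit=25_000_000)
print(f"payout: ${payout:,.0f}")        # payout: $25,000,000
```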
Go here to learn more about IBM Smarter Water Management initiatives. You can also go here to register for a report IBM prepared on why we need smarter water management for our world’s most precious resource.
IBM To Acquire StoredIQ
IBM today announced it has entered into a definitive agreement to acquire StoredIQ Inc., a privately held company based in Austin, Texas.
Financial terms of the deal were not disclosed.
StoredIQ will advance IBM’s efforts to help clients derive value from big data and respond more efficiently to litigation and regulations, dispose of information that has outlived its purpose and lower data storage costs.
With this agreement, IBM adds to its prior investments in Information Lifecycle Governance. The addition of StoredIQ capabilities enables clients to find and use unstructured information of value, respond more efficiently to litigation and regulatory events and lower information costs as data ages.
IBM’s Information Lifecycle Governance suite improves information economics by helping companies lower the total cost of managing data while increasing the value derived from it by:
- Eliminating unnecessary cost and risk with defensible disposal of unneeded data
- Enabling businesses to realize the full value of information as it ages
- Aligning cost to the value of information
- Reducing information risk by automating privacy, e-discovery, and regulatory policies
Adding StoredIQ to IBM’s Information Lifecycle Governance suite gives organizations more effective governance of the vast majority of data, including efficient electronic discovery and its timely disposal, to eliminate unnecessary data that consumes infrastructure and elevates risk.
As a result, business leaders can access and analyze big data to gain insights for better decision-making. Legal teams can mitigate risk by meeting e-discovery obligations more effectively. Also, IT departments can dispose of unnecessary data and align information cost to value to take out excess costs.
What Does StoredIQ Software Do?
StoredIQ software provides scalable analysis and governance of disparate and distributed email as well as file shares and collaboration sites. This includes the ability to discover, analyze, monitor, retain, collect, de-duplicate and dispose of data.
In addition, StoredIQ can rapidly analyze high volumes of unstructured data and automatically dispose of files and emails in compliance with regulatory requirements.
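The announcement lists de-duplication among StoredIQ’s capabilities without describing how it works; content hashing is the standard approach, and the sketch below assumes it. The path and function names are illustrative, and this is not StoredIQ’s actual API.

```python
# Generic content-hash de-duplication across file shares, the usual
# technique behind a "de-duplicate" capability like StoredIQ's
# (an assumption for illustration; not StoredIQ's actual API).
import hashlib
from pathlib import Path

def find_duplicates(root):
    """Group files under `root` by the SHA-256 of their contents."""
    seen = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        seen.setdefault(digest, []).append(path)
    # Keep only digests that appear more than once.
    return {d: paths for d, paths in seen.items() if len(paths) > 1}

# "/mnt/fileshare" is a hypothetical mount point.
for digest, paths in find_duplicates("/mnt/fileshare").items():
    keep, *redundant = paths
    print(f"keeping {keep}; {len(redundant)} redundant copies")
```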

StoredIQ brings powerful, innovative capabilities to govern data in place to drive value up and cost out.
“CIOs and general counsels are overwhelmed by volumes of information that exceed their budgets and their capacity to meet legal requirements,” said Deidre Paknad, vice president of Information Lifecycle Governance at IBM. “With this acquisition, IBM adds to its unique strengths as a provider able to help CIOs and attorneys rapidly drive out excess information cost and mitigate legal risks while improving information utility for the business.”
Named a 2012 Cool Vendor by Gartner, StoredIQ has more than 120 customers worldwide, including global leaders in financial services, healthcare, government, manufacturing and other sectors. Other systems require months to index data and years to configure and install before they can address information governance; StoredIQ can be up and running in just hours, immediately helping clients drive out cost and risk.
IBM intends to incorporate StoredIQ into its Software Group and its Information Lifecycle Governance business.
Building on prior acquisitions of PSS Systems in 2010 and Vivisimo in 2012, IBM adds to its strength in rapid discovery, effective governance and timely disposal of data. The acquisition of StoredIQ is subject to customary closing conditions and is expected to close in the first quarter of 2013.
Go here for more information on IBM’s Information Lifecycle Governance suite, and here for more information on IBM’s big data platform.
The Vindication Of Nate Silver
I was all set to write a closer examination of statistician and blogger Nate Silver’s most recent election predictions, a run-up during which he was lambasted by a garden variety of mostly conservative voices for either being politically biased or building his predictions on a loose set of statistical shingles.
Only to be informed that one of my esteemed colleagues, David Pittman, had already written such a compendium post. So hey, why reinvent the Big Data prediction wheel?
Here’s a link to David’s fine post, which I encourage you to check out if you want to get a sense of how electoral predictions provide an excellent object lesson for the state of Big Data analysis. (David’s post also includes the on-camera interview that Scott Laningham and I conducted with Nate Silver just prior to his excellent keynote before the gathered IBM Information On Demand 2012 crowd.)
I’m also incorporating a handful of other stories I have run across that I think do a good job of helping people better understand the inflection point for data-driven forecasting that Silver’s recent endeavor represents, along with its broader impact in media and punditry.
They are as follows:
“Nate Silver’s Big Data Lessons for the Enterprise”
“What Nate Silver’s success says about the 4th and 5th estates”
“Election 2012: Has Nate Silver destroyed punditry?”
Nate Silver After the Election: The Verdict
As a Forbes reporter wrote in his own post about Silver’s predictions, “the modelers are here to stay.”
Moving forward, I expect we’ll inevitably see an increased capability for organizations everywhere to adopt Silver’s methodical, Bayesian analytical strategies…and well beyond the political realm.
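Silver’s actual models aren’t public in detail, but the Bayesian mechanic he popularized (start with a prior belief, update it as each new poll arrives) is simple to sketch with a beta-binomial update. All poll numbers below are invented.

```python
# Toy Bayesian poll aggregation in the spirit of the approach the
# post describes. Beta-binomial updating; every number is invented,
# and this is not Silver's actual FiveThirtyEight model.

# Prior belief about a candidate's support: Beta(a, b).
a, b = 50.0, 50.0        # weakly informative prior centered on 50%

polls = [                # (respondents favoring candidate, sample size)
    (520, 1000),
    (265, 500),
    (415, 800),
]

for favor, n in polls:
    a += favor           # "successes" update alpha
    b += n - favor       # "failures" update beta
    mean = a / (a + b)
    print(f"after a poll of n={n}: estimated support {mean:.1%}")
# The posterior mean tightens toward ~52% as evidence accumulates.
```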
Live @ Information On Demand 2012: A Q&A With Nate Silver On The Promise Of Prediction
Day 3 at Information On Demand 2012.
The suggestion to “Think Big” continued, as Scott Laningham and I sat down very early this morning with Nate Silver, blogger and author of “The Signal and the Noise,” now a New York Times bestseller (you can read the Times review of the book here).
Nate, a youngish 34, has become our leading statistician through his innovative analyses of political polling, but he first made his name by building a widely acclaimed baseball statistical analysis system called “PECOTA.”
Today, Nate runs the award-winning political website FiveThirtyEight.com, which is now published in The New York Times and which has made Nate the public face of statistical analysis and political forecasting.
In his book, the full title of which is “The Signal and The Noise: Why Most Predictions Fail — But Some Don’t,” Silver explores how data-based predictions underpin a growing sector of critical fields, from political polling to weather forecasting to the stock market to chess to the war on terror.
In the book, Nate poses some key questions: What kinds of predictions can we trust, and are the “predictors” using reliable methods? What sorts of things can, and cannot, be predicted?
In our conversation in the greenroom just prior to his keynote at Information On Demand 2012 earlier today, Scott and I probed along a number of these vectors, asking Nate about the importance of prediction in Big Data, statistical influence on sports and player predictions (a la “Moneyball”), how large organizations can improve their predictive capabilities, and much more.
It was a refreshing and eye-opening interview, and I hope you enjoy watching it as much as Scott and I enjoyed conducting it!