The Schwartz Cloud Report

IBM Wrestles Big Data with Cloudera Pact, Buys Vivisimo

Looking to shore up its big data strategy, IBM has formed a partnership with Cloudera, provider of a widely deployed distribution of Hadoop, the open-source data management platform for searching and analyzing structured and unstructured data distributed across large commodity compute and storage infrastructures.

IBM also announced Wednesday it has agreed to acquire Vivisimo, whose federated search and navigation software is used to analyze big data distributed across an enterprise. Terms were not disclosed. IBM made the two announcements together as it seeks to stake out its position as a key player in the hotly contested big data analytics market.

Big Blue is gunning to extend its analytics leadership beyond traditional data warehouses and marts, as it looks to let users analyze petabytes of data generated from content management repositories and less traditional sources, including social media. IBM's rivals, including Oracle, EMC, Hewlett-Packard, Teradata, Microsoft and SAP, as well as upstarts like Splunk, Hortonworks and even Cloudera, are investing heavily in big data offerings.

IBM said it will integrate the Cloudera Distribution of Hadoop (CDH) and Cloudera Manager with the IBM open source big data platform, called InfoSphere BigInsights, available both as software on premises and in the cloud. The goal is to enable IBM customers and partners to run BigInsights within CDH and Cloudera Manager and allow them to build on top of it.

"Our intention is not to compete directly with Cloudera, it's to build value on top," said David Corrigan, IBM's director of information management strategy. "We struck this partnership with Cloudera and ensured our BigInsights components, advanced components for workload optimization or development environments, would actually leverage the Cloudera open source distribution instead of our own."

Corrigan said Cloudera customers can run IBM's analytics tools, enabling them to extend and take advantage of IBM's big data capabilities, which include enterprise integration, support for real-time analysis of streamed data and extended parallel processing and workload management.

Meanwhile, IBM's move to acquire Pittsburgh-based Vivisimo gives it software that enables federated discovery and navigation of big data. The software allows users to search structured and unstructured data -- typically massive amounts of content -- across disparate systems and allows for the analysis of the data without moving it, Corrigan explained. Instead of moving the data, the software indexes it, assigns relevance and lets users analyze it. "You don't have to move the data into a new location to get a sense of its value," he said. "Leave it in place, therefore it's a faster time to value."
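The index-in-place idea Corrigan describes can be illustrated with a minimal Python sketch. It is not Vivisimo's implementation: the `SOURCES` dictionary, the `build_index` and `search` functions and the term-count relevance score are all hypothetical stand-ins, chosen only to show an index that stores references to documents rather than copies of them.

```python
from collections import Counter, defaultdict

# Hypothetical stand-ins for separate enterprise systems.
# The documents stay where they are; only references are indexed.
SOURCES = {
    "crm": {
        "c1": "customer complaint about billing error",
        "c2": "customer praised support team",
    },
    "wiki": {
        "w1": "billing system architecture overview",
        "w2": "holiday schedule",
    },
}


def build_index(sources):
    """Map each term to {(source, doc_id): term count}."""
    index = defaultdict(dict)
    for source_name, docs in sources.items():
        for doc_id, text in docs.items():
            for term, count in Counter(text.lower().split()).items():
                index[term][(source_name, doc_id)] = count
    return index


def search(index, query):
    """Rank document references by summed term counts (a toy relevance score)."""
    scores = Counter()
    for term in query.lower().split():
        for ref, count in index.get(term, {}).items():
            scores[ref] += count
    # Highest score first; ties broken by (source, doc_id) for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index = build_index(SOURCES)
results = search(index, "billing customer")
for (source_name, doc_id), score in results:
    print(source_name, doc_id, score)
```

Each result is a pointer back to the system that holds the document, so a single query spans both toy sources without consolidating them. A production system would use a far richer relevance model than raw term counts.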

Forrester analyst Boris Evelson said in a blog post that data discovery is an important step, but only the first one, in the BI and analytics cycle. "Once you discover a pattern using a product like Vivisimo, you may need to productionalize or persist your findings in a traditional DW, and then build reports and dashboards for further analysis using traditional BI technologies."

Posted by Jeffrey Schwartz on April 26, 2012