Hadoop is gaining quite a few friends this fall. Only a month ago, the R analytics language was ported to work with Hadoop; today, Cloudera announced that Informatica will support Hadoop, starting in the first quarter of 2011.
Ed Albanese, head of business development for Cloudera, said that Informatica brings a large stable of developers to Hadoop. “Informatica has really broad reach. They have over 4,000 customers and a substantial developer community,” he said.
“Accenture has trained over 3,000 consultants that speak Informatica. From a Cloudera standpoint, anyone who speaks Informatica can now speak Hadoop. The user interfaces people use for tools and pipelines in Informatica can be done in Hadoop.”
Working together, Cloudera and Informatica are building paths for moving data both into and out of Hadoop. Once data has been pushed into Hadoop, Informatica will offer a layer of software that translates Informatica code into Pig (a high-level data-flow language that compiles into Hadoop MapReduce jobs). That translation layer will allow Informatica developers to write their own Hadoop jobs without having to learn Pig.
“Informatica is a data integration company that moves data from point A to point B,” said Albanese. “One of the new targets can be Hadoop. If data from log files is loaded into Hadoop, Informatica can grab that data and run it through a transformation job. It can also grab data from different systems, like SAP and Salesforce.com. We become both a source and a target.”
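To make the idea of a transformation job over log data concrete, here is a minimal sketch in plain Python. It is not Informatica's or Cloudera's code, and it does not use Hadoop or Pig; the log layout, the sample lines, and the function names `parse_line` and `count_by_status` are all hypothetical. It only illustrates the parse, group, and count pattern that such a job would typically express.

```python
from collections import Counter

# Hypothetical log lines in a simplified "METHOD PATH STATUS" layout (an assumption).
SAMPLE_LOG = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "GET /index.html 200",
]


def parse_line(line):
    """Split one log line into (method, path, status); return None if malformed."""
    parts = line.split()
    if len(parts) != 3 or not parts[2].isdigit():
        return None
    method, path, status = parts
    return method, path, int(status)


def count_by_status(lines):
    """Group parsed records by status code and count how many fall in each group."""
    records = (parse_line(line) for line in lines)
    return Counter(rec[2] for rec in records if rec is not None)


status_counts = count_by_status(SAMPLE_LOG)
print(dict(status_counts))
```

On a cluster, the same grouping step would run in parallel across many log files; the sketch keeps everything in memory purely for readability.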
Informatica plans to release these integrations and connectors in the first quarter of 2011. They are designed to work specifically with Cloudera’s disk image-based Hadoop distributions. Users of Cloudera’s enterprise Hadoop distribution will be able to buy support for the Informatica connectors directly from Informatica.