Topic: data

Hitachi to acquire Big Data analytics company Pentaho

Japanese enterprise technology company Hitachi has announced its plans to acquire Big Data analytics company Pentaho. Hitachi will add Pentaho’s open-source Big Data deployment platform to its portfolio, eventually merging Pentaho’s Big Data analysis and machine-learning capabilities into its Hitachi Data Systems (HDS) products. According to Hitachi’s announcement, the company’s plans for a shared analytics … continue reading


Data Governance Initiative expands the Hadoop ecosystem

Hadoop has, for the most part, moved beyond the proof-of-concept phase and the initial chasm of adoption. More and more organizations are putting the open-source framework to work on mountains of complex Big Data. The next step in Hadoop’s evolution is getting a handle on governance. To that end, Hortonworks—the enterprise data platform provider and … continue reading

Apache Falcon graduates to Top-Level Project, Canonical announces Ubuntu Core for IoT, Facebook open-sources AI tools for Torch—SD Times news digest: Jan. 20, 2015

The Apache Software Foundation has announced that Apache Falcon has graduated from the Apache Incubator to a Top-Level Project. Falcon is a data processing and management solution for Apache Hadoop with a focus on data motion, data discovery, coordination of data pipelines, and life-cycle management. “Apache Falcon solves a very important and critical problem in … continue reading

Inside the Apache Software Foundation’s Flink

The Apache Software Foundation has announced Apache Flink as a Top-Level Project (TLP). Flink is an open-source Big Data system that fuses processing and analysis of both batch and streaming data. The data-processing engine, which offers APIs in Java and Scala as well as specialized APIs for graph processing, is presented as an alternative to … continue reading

Google open-sources Cloud Dataflow SDK, proposes marking HTTP as non-secure, adds feature support to Dart

Google has announced the open-source availability of the Cloud Dataflow SDK, allowing developers to integrate their apps with the Dataflow-managed data processing service. Google software engineer Sam McVeety made the announcement in a blog post detailing how developers now have the capability to begin porting Dataflow to other languages and execution environments, and they can … continue reading

Zeichick’s Take: Is the best place for data on-prem or in the cloud? Ask your lawyer

Cloud-based storage is amazing. Simply amazing. That’s especially true when you are talking about data from end users that are accessing your applications via the public Internet. If you store data in your local data center, you have the best control over it. You can place it close to your application servers. You can amortize … continue reading

NSA releases automated data flow software as open source

The National Security Agency has released the first in a series of software products by its Technology Transfer Program (TTP) to the open-source community. Niagarafiles, also known as NiFi, automates high-volume data flows among computer networks, even if data formats and protocols differ. The technology “provides a way to prioritize data flows more effectively and … continue reading

Industry Watch: Big Data: Now what?

Organizations today get that they have to collect data to stay competitive. They understand how to store it, retrieve it and slice it. The idea now is to understand the data itself, to detect patterns and trends that will help the organization get new customers or members, service them more personally and engage with them … continue reading

Snowflake offers cloud data warehouse as a service, cheaply

Snowflake has appeared in time for winter. This cloud data warehousing and analytics company revealed its product for the first time this week, coming out after nearly two years of working in stealth mode. Bob Muglia, CEO of Snowflake Software, is a former Microsoft executive who joined Snowflake in June. He said the sweet spot … continue reading

SD Times news digest: October 8, 2014—GitHub’s Student Developer Pack, IBM releases Watson APIs, Facebook’s open-source Chef tools

The GitHub Student Developer Pack: GitHub has partnered with a host of commercial and open-source platforms to release the GitHub Student Developer Pack. The developer pack provides students with free access to developer tools. “There’s no substitute for hands-on experience, but for most students, real world tools can be cost prohibitive,” John Britton, education liaison … continue reading

Espresso Logic brings NoSQL and SQL data together into a single API

One of the biggest problems developers face when building data-driven apps is having to access data from multiple data sources, according to R. Paul Singh, CEO of Espresso Logic. “What we are seeing and hearing is a lot of customers’ data isn’t in SQL only; they also have it in NoSQL databases,” he said. “Having … continue reading

Zeichick’s Take: Look to the intranet

Where do your employees go to find shared data? If it’s external data, probably an external search engine, like Google (which apparently holds 67.6% of the U.S. market) or Bing (18.7%) or one of the niche players. What about internal data? If your organization uses a platform like Microsoft’s SharePoint, that platform includes a pretty … continue reading