Topic: hdfs

Apache Kudu becomes top-level project

The Apache Kudu project is, as of today, a top-level project within the Apache Software Foundation. Originally contributed by Cloudera, the project is an effort to build a highly efficient analytics platform for fast-moving data, such as streams. In practice, Kudu is a columnar storage manager for Hadoop. The system is … continue reading

Apache Apex reaches top level

The Apache Apex project has been promoted from the Apache Incubator to a top-level project as of today. This open-source stream- and batch-processing platform works with YARN and HDFS, runs in memory, and can handle event processing and fault tolerance. Apex started out as the real-time streaming core of DataTorrent. The company contributed its platform … continue reading

Is Spark replacing Hadoop?

The Apache Hadoop project took off in enterprises over a fairly short period of time. Four or five years ago, Hadoop was just becoming a “thing” for enterprise data processing and experimentation. MapReduce was at the heart of that thing, and Spark was still only a research project at the University of California at Berkeley. … continue reading

Arun Murthy discusses the future of Hadoop

Arun Murthy is a busy fellow. When he’s not acting as architect at Hortonworks, the Hadoop company he founded, he’s flying around the world giving keynote addresses. This is quite a long way from where he was 10 years ago, working on Hadoop inside Yahoo. But then, the future is, typically, uncertain. That’s why we … continue reading

Inside the Apache Software Foundation’s Flink

The Apache Software Foundation has announced Apache Flink as a Top-Level Project (TLP). Flink is an open-source Big Data system that fuses processing and analysis of both batch and streaming data. The data-processing engine, which offers APIs in Java and Scala as well as specialized APIs for graph processing, is presented as an alternative to … continue reading

Plugging in to Hadoop

Even if it’s not where they end up, Hadoop can be a great starting platform for a data-driven software company. That’s what San Francisco- and Taipei-based Fliptop found in 2009 when they launched a social media identity matching engine, ultimately employed by such companies as MailChimp, Dell, Toyota, Oracle and Nordstrom. Based on Amazon Web … continue reading

Hadoop is now a general-purpose platform

Skepticism is giving way once developers see what the latest version of Hadoop can do … continue reading

Doug Cutting: Why Hadoop is still No. 1

How Hadoop became the de facto standard, and what it plans on doing next … continue reading

Hadoop 2.0 comes, bringing YARN

New version of Big Data analysis platform includes a highly available file system and unshackles the platform from MapReduce … continue reading
