Google has announced a partnership with Cloudera to bring its Dataflow programming model into Apache Spark. According to the company, developers need a powerful, flexible and easy-to-use programming model to stay productive, and the Dataflow model provides maximum productivity and seamless portability. Dataflow currently offers a direct pipeline runner, a Google Cloud Dataflow runner, and … continue reading
MapR Technologies, Inc., provider of the top-ranked distribution for Apache Hadoop, today announced an initiative to integrate Apache Drill, which provides instant, self-service data exploration across multiple data sources, with Apache Spark, the in-memory processing framework that provides speed, programming ease and real-time processing advantages. “The MapR initiative to integrate Apache Drill with Apache Spark’s … continue reading
Developers can now become Apache Spark certified. Databricks, the company founded by the creators of the open-source Big Data processing engine, and O’Reilly Media are teaming up to launch the Apache Spark Developer Certification Program. Spark made strides earlier this year when the Apache Software Foundation moved it from the Apache Incubator to a top-level project. “The adoption of … continue reading
Version 1 features strong API stability guarantees, a new Spark SQL component, and extended Java and Python support … continue reading
Open-source cluster computing framework for fast and flexible large-scale data analysis gains support … continue reading