Cloudera releases Cloudera Enterprise 5.2, Impala 2.0, Cloudera Director product and Cloudera Accelerator Program
Cloudera used the Strata stage to unveil a slew of new products, programs and updates.

The Apache Hadoop software provider released version 5.2 of Cloudera Enterprise, its data analytics management solution for Hadoop, with several security advancements, including simple key management, enhanced auditing and component coverage in Cloudera Navigator, and Sentry policy management support in the open-source Hadoop UI.

Cloudera also announced a new product and two new programs: the Cloudera Director self-service platform for managing enterprise cloud deployments, and the Cloudera Accelerator Program and Cloudera Labs, both aimed at real-time streaming innovation.

Cloudera Director extends the company’s enterprise data hub architecture to the cloud, allowing for self-service provisioning through a simple interface and foundational support for hybrid deployments across multiple cloud environments. The Cloudera Accelerator Program will work with partners to further advance real-time streaming architectures such as the Apache Spark framework, while Cloudera Labs will serve as a virtual incubator for open-source initiatives the Cloudera engineering team is contributing to, such as the Apache Kafka fault-tolerant messaging system.

Finally, Cloudera announced version 2.0 of its Impala open-source analytics database running natively in Hadoop. Impala is a core component of Cloudera Enterprise 5.2, and the 2.0 release bolsters the platform with SQL 2003 support for standards-based analytics, role-based access controls and Apache Sentry integration, legacy data type migration and vendor-specific SQL extensions.

Hortonworks Data Platform 2.2 released
Hadoop distribution vendor Hortonworks announced version 2.2 of the Hortonworks Data Platform (HDP), the company’s enterprise data platform built around Hadoop YARN.

HDP 2.2 adds more than 100 new features that integrate with YARN to enable batch, interactive and real-time methods of interacting with a single set of Hadoop data. New features include:

  • A YARN-ready Apache Spark engine for data science and an Apache Kafka engine for Internet of Things data processing.
  • Enterprise SQL at Hadoop scale with the Stinger.next initiative, adding updated SQL semantics for ACID transactions in Apache Hive, a cost-based optimizer for better SQL query performance and ORC file compression.
  • Apache Argus for centralized security administration and policy enforcement, integrated with Apache Storm and Apache Knox, with the ability to enforce policy in Hive and HBase.
  • Management and monitoring improvements: a 100% uptime target with rolling cluster upgrades, Ambari Views for custom visualization and Ambari Blueprints for template-based cluster deployment.
  • Automated cluster backup to the cloud for Microsoft Azure and Amazon S3.

An HDP 2.2 preview is currently available, and the release will be generally available in November.

MongoDB announces enhancements to MongoDB Management Service

MongoDB is rolling out upgrades to its database management service, MMS, to improve MongoDB provisioning, monitoring, backup and scaling.
The company behind the popular cross-platform NoSQL database claims the revamped MongoDB Management Service reduces operational overhead by up to 95% for deployments of any size. The key enhancements to MMS that MongoDB highlighted include:

  • Advanced AWS Integration: MMS can provision and optimize Amazon Web Services instances for MongoDB automatically.
  • Upgrades: MMS manages upgrades and downgrades of deployments in minutes, with no downtime.
  • Scale Out: Users can rapidly scale deployments, adding capacity without taking the application offline.
  • Infrastructure Agnostic: MMS works with any internet-connected infrastructure including public or private clouds and laptops, controlled through a single interface.
  • Continuous Backups: MMS backs up deployments continuously, staying seconds behind the production database, without impacting performance.
  • Point-in-time Recovery: Users can restore deployments to any point in time.
  • Performance Alerts: Users can receive custom alerts on more than 100 system metrics via email, SMS, PagerDuty, HipChat and other services.

Pentaho announces Data Refinery Blueprint for automated data modeling
Enterprise Big Data analytics platform Pentaho announced a new architecture blueprint for Streamlined Data Refinery, a design pattern for orchestrating blended data sets for on-demand Hadoop queries.

According to Pentaho, a Streamlined Data Refinery solution can expand automated business user capabilities through secure, blended and on-demand analytics. The blueprint launch supports Streamlined Data Refinery architectures by automating the modeling process and publishing data to large-scale analytical databases such as HP Vertica while still meeting IT requirements.

Predixion Software releases Predixion Insight 4.0
Cloud-based predictive analytics software provider Predixion Software announced the latest version of its predictive analytics platform, Predixion Insight 4.0. The release expands the platform’s predictive analytics capabilities across applications, databases, data stores, real-time engines, devices and machines.

New features and support in Predixion Insight 4.0 include:

  • Deployment of scripts and packages created with other predictive modeling tools, leveraging machine learning libraries such as Apache Mahout, statistical programming languages such as R, and PMML integration support.
  • Combination of structured and unstructured data from multiple sources.
  • Visualizations and summaries with immediate feedback.
  • A single portable object called an MLSM (Machine Learning Semantic Model) package containing all transformations and analytics for deployment anywhere.
  • Solution Accelerator, a framework for rapid creation of custom web-based predictive applications.

Attunity Replicate 4.0
Attunity has released Attunity Replicate 4.0, the latest version of its Big Data replication solution for Hadoop.

The Big Data management and distribution provider aims to reduce the time, labor and cost of Hadoop implementations with version 4.0, which adds high-performance data loading and extraction for Hadoop to Attunity Replicate through optimized processes and APIs. Other new features include drag-and-drop data configuration, a Web-based performance metrics dashboard and certification with top Hadoop distributions including Hortonworks and Cloudera.

GraphLab Create 1.0 now generally available
GraphLab, the high-performance distributed computation startup behind the GraphLab parallel machine learning C++ framework, announced the general availability of GraphLab Create 1.0.

New features in the GraphLab Create 1.0 release include the ability to build scalable predictive applications that are deployed on AWS and queried in real time through a RESTful API. The release also adds expanded Deep Learning, Boosted Tree algorithm and dashboard visualization capabilities, along with a new Auto-tuning Toolkits API that automatically selects a machine learning model for enterprises.

GraphLab also marked the Create 1.0 release with Hadoop, Apache Spark and Apache Avro integrations.