Topic: data lake

Apache Gobblin now top-level project

The Apache Software Foundation (ASF) announced that Apache Gobblin, the open-source distributed Big Data integration framework, has reached top-level project status. According to the foundation, achieving top-level status means that the project has graduated from the Apache Incubator and has demonstrated that its community and products have been well-governed under the ASF’s meritocratic process and principles. … continue reading

SD Times Open-Source Project of the Week: Apache Arrow Flight

This week’s featured open-source project is Apache Arrow Flight, an RPC framework for high-performance data services based on Arrow data. The project was co-developed by data lake engine company Dremio, which recently added new support, and is built on top of gRPC and the IPC format. According to the team, Flight works by defining a … continue reading

SD Times Open-Source Project of the Week: Workload Analyzer for Presto

The Workload Analyzer for Presto was open sourced this week by Varada, a data lake query acceleration innovator that aims to help data engineers gain holistic visibility into the performance of Presto clusters. Varada originally built the tool because it leverages the distributed SQL query engine Presto in its query acceleration engine Varada Data Platform.  … continue reading

SD Times news digest: Cloudflare acquires Linc, Amazon launches AWS Glue custom connectors, ThreatStack now available for Ruby Gems and NPM

Cloudflare’s acquisition of Linc, the automation platform that helps front-end developers collaborate, will create seamless integration between Pages and Cloudflare Workers, a serverless execution environment that allows users to create entirely new applications or augment existing ones. Linc offers a straightforward path to building end-to-end applications on Pages with both frontend and backend logic in one bundle. … continue reading

SD Times news digest: JetBrains WebStorm 2020.3, Instana Enterprise Observability for Microservices now available on AWS, Informatica’s new data lake management solution

This latest release of JetBrains’ JavaScript IDE is packed with many long-awaited enhancements, including support for Tailwind CSS, the ability to sync the IDE theme with your OS settings, and Git staging. WebStorm 2020.3 also includes a new welcome screen and improvements for working with … continue reading

SD Times news digest: npm public roadmap, TIBCO to acquire Information Builders, and HackerOne expands integrations ecosystems

npm has released a new public roadmap. Developers can use the roadmap to learn more about the features being worked on, the stage they’re in, and when they can be expected. They can also open a discussion and share suggestions for how the products should be improved and discuss … continue reading

How to prepare for the General Data Protection Regulation

Coming into force on May 25, 2018 is the long-awaited European General Data Protection Regulation (GDPR), which will change how businesses handle data on their customers and employees. In this ever-evolving world of data privacy, it’s important for companies to not only gain a strong understanding of GDPR, but understand where their data is located … continue reading

Alation and Paxata join forces, RightMesh opens application for network protocol SDK and more - SD Times News Digest: September 28, 2017

Alation and Paxata partner to provide consumers a better way to gather and analyze data. Today, data management companies Alation and Paxata announced a new partnership aimed at simplifying the process of establishing trust in data lakes. The partnership introduces a new “click-to-profile” data discovery feature where users can start discovering data using the Alation Data … continue reading

Strata + Hadoop World: MapR Edge, Zaloni Data Lake in a Box, and Dell EMC Ready Bundle for Hortonworks Hadoop

At the conference, MapR announced MapR Edge, a new solution to drive processing and analytics close to the edge. It captures, processes and analyzes data from Internet of Things devices and provides secure local processing, quick aggregation of insights, and the ability to push intelligence back to the edge. “Our customers have pioneered the use … continue reading

Hadoop Summit: Cascading 3.0, DgSecure 5.0 and MapR’s Azure integration

Data application infrastructure provider Concurrent has announced general availability of Cascading 3.0. The latest version of the enterprise Hadoop application development and deployment platform improves portability across programming languages such as Java, Scala and SQL, and across Hadoop distributions such as Cloudera, Hortonworks and MapR. It also now has native support for compute fabrics like Apache … continue reading

Hadoop Summit roundup: MapR 5.0, HDP 2.3, Pentaho 5.4 and more Big Data news

At Hadoop Summit, MapR announced the release of MapR 5.0, along with new auto-provisioning templates for data lake deployment, interactive SQL data exploration, and operational analytics. Version 5.0 of the MapR Hadoop distribution adds a new Views feature for the newly released Apache Drill 1.1 for agile data governance, and granular access controls for better … continue reading

Hadoop and beyond: A primer on Big Data for the little guy

Have you heard the news? A “data lake” overflowing with information about Hadoop and other tools, data science and more threatens to drown IT shops. What’s worse, some Big Data efforts may fail to stay afloat if they don’t prove their worth early on. “Here’s a credible angle on why Big Data could implode,” began … continue reading
