Topic: data lake

AWS Glue Data Quality delivers high-quality data across data lakes and pipelines

AWS announced the availability of AWS Glue Data Quality, which delivers high-quality data across data lakes and pipelines.  A vast number of users establish data lakes, but without data quality, these can transform into data swamps, according to AWS. Establishing data quality is an intricate and lengthy procedure. It necessitates manual scrutiny and the formulation … continue reading

SD Times news digest: OpenAI Startup Fund, JFrog Private Distribution Network, and Databricks Data Live Tables and Unity Catalog

The newly announced OpenAI Startup Fund is investing $100 million to partner with a small number of early-stage startups that are involved in fields that have a lot of potential for AI like health care, climate change and education. The companies in the fund will also get early access to future OpenAI systems, support from … continue reading

Apache Gobblin now top-level project

The Apache Software Foundation (ASF) announced that Apache Gobblin, the open-source distributed Big Data integration framework, has reached top-level project status. According to the foundation, achieving top-level status means that the project graduated from the Apache Incubator and has demonstrated that its community and products have been well-governed under the ASF’s meritocratic process and principles. … continue reading

SD Times Open-Source Project of the Week: Apache Arrow Flight

This week’s featured open-source project is Apache Arrow Flight, an RPC framework for high-performance data services based on Arrow data. The project was co-developed by data lake engine company Dremio, which recently added new support, and is built on top of gRPC and the IPC format. According to the team, Flight works by defining a … continue reading

SD Times Open-Source Project of the Week: Workload Analyzer for Presto

The Workload Analyzer for Presto was open sourced this week by Varada, a data lake query acceleration innovator that aims to help data engineers gain holistic visibility into the performance of Presto clusters. Varada originally built the tool because it leverages the distributed SQL query engine Presto in its query acceleration engine Varada Data Platform.  … continue reading

SD Times news digest: Cloudflare acquires Linc, Amazon launches AWS Glue custom connectors, ThreatStack now available for Ruby Gems and NPM

Cloudflare’s acquisition of Linc, the automation platform that helps front-end developers collaborate, will create seamless integration between Pages and Cloudflare Workers, a serverless execution environment that allows users to create entirely new applications or augment existing ones. Linc offers a straightforward path to building end-to-end applications on Pages with both frontend and backend logic in one bundle. … continue reading

SD Times news digest: JetBrains WebStorm 2020.3, Instana Enterprise Observability for Microservices now available on AWS, Informatica’s new data lake management solution

This latest release of JetBrains’ JavaScript IDE is packed with many long-awaited enhancements, including support for Tailwind CSS, the ability to sync the IDE theme with your OS settings, and Git staging. WebStorm 2020.3 also includes a new welcome screen and improvements for working with … continue reading

SD Times news digest: npm public roadmap, TIBCO to acquire Information Builders, and HackerOne expands integrations ecosystems

The npm team has released a new public roadmap. Developers can use the roadmap to learn more about the features that are being worked on, the stage that they’re in, and when they can be expected. They can also open a discussion and share suggestions for how the products should be improved and discuss … continue reading

How to prepare for the General Data Protection Regulation

Coming into force on May 25, 2018 is the long-awaited European General Data Protection Regulation (GDPR), which will change how businesses handle data on their customers and employees. In this ever-evolving world of data privacy, it’s important for companies to not only gain a strong understanding of GDPR, but understand where their data is located … continue reading

Alation and Paxata join forces, RightMesh opens application for network protocol SDK and more — SD Times News Digest: September 28, 2017

Alation and Paxata partner to provide consumers a better way to gather and analyze data. Today, data management companies Alation and Paxata announced a new partnership aimed at making it simpler to establish trust in data lakes. The partnership introduces a new “click-to-profile” data discovery feature where users can start discovering data using the Alation Data … continue reading

Strata + Hadoop World: MapR Edge, Zaloni Data Lake in a Box, and Dell EMC Ready Bundle for Hortonworks Hadoop

At the conference, MapR announced MapR Edge, a new solution to drive processing and analytics close to the edge. It captures, processes and analyzes data from Internet of Things devices and provides secure local processing, quick aggregation of insights, and the ability to push intelligence back to the edge. “Our customers have pioneered the use … continue reading

Hadoop Summit: Cascading 3.0, DgSecure 5.0 and MapR’s Azure integration

Data application infrastructure provider Concurrent has announced general availability of Cascading 3.0. The latest version of the enterprise Hadoop application development and deployment platform improves portability across programming languages such as Java, Scala and SQL, and across Hadoop distributions such as Cloudera, Hortonworks and MapR. It also now has native support for compute fabrics like Apache … continue reading
