AWS announced the availability of AWS Glue Data Quality, which delivers high-quality data across data lakes and pipelines. A vast number of users establish data lakes, but without data quality, these can transform into data swamps, according to AWS. Establishing data quality is an intricate and lengthy procedure. It necessitates manual scrutiny and the formulation … continue reading
The newly announced OpenAI Startup Fund is investing $100 million to partner with a small number of early-stage startups that are involved in fields that have a lot of potential for AI like health care, climate change and education. The companies in the fund will also get early access to future OpenAI systems, support from … continue reading
The Apache Software Foundation (ASF) announced that Apache Gobblin, the open-source distributed Big Data integration framework, has reached top-level project status. According to the foundation, achieving top-level status means that the project graduated from the Apache Incubator and has demonstrated that its community and products have been well-governed under the ASF’s meritocratic process and principles. … continue reading
This week’s featured open-source project is Apache Arrow Flight, an RPC framework for high-performance data services based on Arrow data. The project was co-developed by data lake engine company Dremio, which recently added new support, and is built on top of gRPC and the IPC format. According to the team, Flight works by defining a … continue reading
The Workload Analyzer for Presto was open sourced this week by Varada, a data lake query acceleration innovator that aims to help data engineers gain holistic visibility into the performance of Presto clusters. Varada originally built the tool because it leverages the distributed SQL query engine Presto in its query acceleration engine Varada Data Platform. … continue reading
Cloudflare’s acquisition of Linc, the automation platform that helps front-end developers collaborate, will create seamless integration between Pages and Cloudflare Workers, a serverless execution environment that allows users to create entirely new applications or augment existing ones. Linc offers a straightforward path to building end-to-end applications on Pages with both frontend and backend logic in one bundle. … continue reading
This latest release of JetBrains’ JavaScript IDE is packed with many long-awaited enhancements, including support for Tailwind CSS, the ability to sync one’s IDE theme with their OS settings, and Git staging. WebStorm 2020.3 also includes a new welcome screen and improvements for working with … continue reading
The npm team has released a new public roadmap. Developers can use the roadmap to learn more about the features that are being worked on, the stage that they’re in, as well as when they can be expected. They can also open a discussion and share suggestions for how the products should be improved and discuss … continue reading
The long-awaited European General Data Protection Regulation (GDPR) comes into force on May 25, 2018, and will change how businesses handle data on their customers and employees. In this ever-evolving world of data privacy, it’s important for companies to not only gain a strong understanding of GDPR, but also understand where their data is located … continue reading
Alation and Paxata partner to provide consumers a better way to gather and analyze data. Today, data management companies Alation and Paxata announced a new partnership aimed at making it simpler to establish trust in data lakes. The partnership introduces a new “click-to-profile” data discovery feature where users can start discovering data using the Alation Data … continue reading
At the conference, MapR announced MapR Edge, a new solution to drive processing and analytics close to the edge. It captures, processes and analyzes data from Internet of Things devices and provides secure local processing, quick aggregation of insights, and the ability to push intelligence back to the edge. “Our customers have pioneered the use … continue reading
Data application infrastructure provider Concurrent has announced general availability of Cascading 3.0. The latest version of the enterprise Hadoop application development and deployment platform improves portability across programming languages such as Java, Scala and SQL, and across Hadoop distributions such as Cloudera, Hortonworks and MapR. It also now has native support for compute fabrics like Apache … continue reading