Data is the information that drives business. It can be structured in rows and columns, such as a customer's name, address, and phone number, or it can be unstructured, such as an email or a social media post. Structured data typically lives in relational database management systems (RDBMSs), such as those from Oracle, IBM and Microsoft, as well as the open-source PostgreSQL and MySQL, among others. That data can be accessed using the standard Structured Query Language (SQL). Unstructured and semi-structured data is often stored in what are called NoSQL databases, such as Cassandra, Couchbase, MongoDB and many others. Many organizations today run both kinds of databases.
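To make the distinction concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in for a relational database and a JSON document as a stand-in for a record in a document-style NoSQL store. The table name, field names and sample values are all hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Structured data: every record has the same fixed columns and is queried with SQL.
conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE customers (name TEXT, address TEXT, phone TEXT)")
conn.execute(
    "INSERT INTO customers VALUES (?, ?, ?)",
    ("Ada Lopez", "12 Elm St", "555-0100"),  # hypothetical sample row
)
rows = conn.execute(
    "SELECT name, phone FROM customers WHERE name = ?", ("Ada Lopez",)
).fetchall()
conn.close()

# Unstructured or semi-structured data: free-form content with no fixed schema,
# shown here as a JSON document of the kind a document store might hold.
post = {
    "type": "social_post",
    "author": "Ada Lopez",
    "body": "Loving the new release!",
    "tags": ["release"],
}
document = json.dumps(post)      # serialize for storage
restored = json.loads(document)  # read it back without any schema

print(rows)
print(restored["body"])
```

The SQL query depends on the table's declared columns, while the JSON document can gain or lose fields from one record to the next, which is one reason many organizations run both kinds of systems side by side.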
Once data is stored, it must be easy to find amid the mountains of data organizations collect, easy to retrieve, and available at scale. Numerous tools exist for those jobs, including Hadoop, Apache Spark and many more. It is through the collection and analysis of data that businesses can make decisions that affect their bottom line.
Frances “Fran” Allen, the first female IBM Fellow and the first woman to win the Turing Award, died on August 4, 2020, the day of her 88th birthday. She was a pioneer in compiler organization and optimization algorithms and made seminal contributions to the world of computing. Her work on inter-procedural analysis and automatic parallelization …
With the explosion of data from today’s increasingly complex systems, the issue of data connectivity becomes more important than ever. CData, which has grown as a provider of drivers for data connectivity, is positioned for growth as a data connectivity platform — just without the platform. Instead, CData is focused on bringing data connectivity capabilities …
Yellowbrick Data and Protegrity are teaming up to provide advanced data security and privacy solutions. “Protegrity delivers leading-edge data security and privacy solutions to the world’s largest enterprises across the leading platforms and data stores,” said Allen Holmes, vice president of business development at Yellowbrick Data. “Combined with the power and scale of Yellowbrick’s hybrid …
The Apache Arrow team has announced the release of Apache Arrow 1.0.0. Apache Arrow is a development platform for in-memory analytics. Version 1.0.0 is the 18th release of the platform. It features 810 resolved issues from 100 contributors. According to the team, this release marks a transition to binary stability of the columnar format and …
Cloudflare has unveiled a new serverless solution to compete with AWS Lambda. The release of Cloudflare Workers Unbound offers a serverless platform for developers to run complicated computing workloads across the Cloudflare network and pay only for what they use. According to the company, the new solution can save users up to 75% for the …
The Xen Project announced the latest version of its open-source hypervisor. Xen Project Hypervisor 4.14 introduces Linux stubdomains, better nested performance and more robust live patching, and reflects contributions from across the community and ecosystem. A new development from the Xen Project Functional Safety Working Group is the successful drafting of prototype requirement documents and …
Perforce Software announced that it acquired Methodics, a provider of intellectual property life cycle management and traceability solutions for enterprises. Perforce explained the acquisition will help it expand its DevOps portfolio. “The semiconductor and embedded software design markets continue to expand, especially as they serve growing AI, automotive, cloud, and IoT markets,” said Mark Ties, …
Apache APISIX, the cloud-native API gateway used to handle interface traffic for web, mobile, and IoT applications, just reached Top-Level Project status at the Apache Software Foundation. Apache APISIX is based on Nginx and etcd. “Thanks to the help of our mentors, contributors and the Apache Incubator, Apache APISIX has now graduated as a Top-Level …
This week’s featured open-source project is Lumos, a Python library built to compare metrics between two datasets, accounting for population differences and invariant features. Lumos was open sourced this month by Microsoft. In a technical paper that shows the results from a real-world deployment of Lumos in Microsoft RTC applications, the Microsoft team wrote: …
MobileIron announced multi-vector mobile phishing protection for iOS and Android devices to help organizations defend against the top cybersecurity threats. The solution offers on-device and cloud-based phishing URL database lookup to detect and remediate phishing attacks across all mobile threat vectors, including text and SMS messages, instant messages, social media and other modes of communication, …
Microsoft and the OpenDP Initiative at Harvard have collaborated on a new platform that will offer differential privacy for large datasets. Differential privacy allows researchers to analyze datasets without having important data withheld, while also preserving the privacy of that data, according to Microsoft. “Differential privacy, the heart of today’s landmark milestone, was invented at …
MLflow, the open-source machine learning platform created by Databricks, has joined the Linux Foundation. The version update MLflow 1.9.1 was also released this week with bug fixes and improvements. The project has seen more than two million downloads per month and is growing fourfold every year. The project was first introduced at Spark + AI …