Data is the information that drives business. It can be structured in rows and columns, such as a customer's name, address, and phone number, or it can be unstructured, such as an email or a social media post. Structured data populates relational database management systems (RDBMS) such as those from Oracle, IBM and Microsoft, as well as the open-source PostgreSQL and MySQL, among others. That data can be accessed using the standard Structured Query Language (SQL). Unstructured data typically resides in what are called NoSQL databases, such as Cassandra, Couchbase, MongoDB and many others. Many organizations today run both kinds of databases.
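To make the distinction concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for a relational system and a plain JSON document as a stand-in for the kind of record a document-oriented NoSQL store might hold. The table name, field names, and sample values are all hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Structured data: fixed columns, queried with SQL.
# SQLite (in memory) stands in for a relational database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT, phone TEXT)")
conn.execute(
    "INSERT INTO customers VALUES (?, ?, ?)",
    ("Ada Example", "1 Sample St", "555-0100"),  # hypothetical customer
)
rows = conn.execute(
    "SELECT name, phone FROM customers WHERE name = ?",
    ("Ada Example",),
).fetchall()

# Unstructured or loosely structured data: no fixed schema.
# A JSON document stands in for what a document store might hold.
post = {
    "type": "social_post",
    "text": "Loving the new release!",
    "tags": ["launch"],
}
serialized = json.dumps(post)      # what would be written to the store
restored = json.loads(serialized)  # what would be read back

print(rows)
print(restored["text"])
```

The relational side requires the columns to be declared up front, while the document side accepts whatever fields each record happens to carry. That difference is one reason an organization might run both kinds of databases.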
Once the data is stored, it must be easy to find amid the mountains of data organizations collect, easy to retrieve, and available at scale. Numerous tools exist for those jobs, including Hadoop, Apache Spark and many more. It is through the collection and analysis of data that businesses can make decisions that affect their bottom line.
GitHub released the limited beta of GitHub Package Registry, a package management service that makes it easy to publish public or private packages next to your source code. Pricing details will be announced soon. The service is fully integrated with GitHub and it provides fast, reliable downloads backed by GitHub’s global CDN. It also supports … continue reading
Android Q is getting new security features that include encryption, platform hardening and authentication. In the Q release, Google is launching Adiantum, designed to run efficiently without cryptographic acceleration hardware, and to work across everything from smart watches to internet-connected medical devices. Now, all compatible Android devices newly launching with Android Q are required … continue reading
Scalyr, a provider of log management and observability, released a new set of data operations within Scalyr called PowerQueries. Utilizing Scalyr’s real-time processing engine, PowerQueries lets users switch between facet-based search and complex log search operations for complicated data sets, such as grouping, transformations, filtering and sorting, table lookups and joins. “We received a lot … continue reading
Databricks today announced Delta Lake, an open-source project designed to bring reliability to data lakes for both batch and streaming data. The project was revealed during the Spark + AI Summit taking place this week in San Francisco. Data lakes are used as repositories for structured and unstructured data, but factors such as failed writes, schema … continue reading
Data warehousing is one of the core sources of enterprise information, but most organizations are still unable to unlock the potential value of their investments. For one thing, traditional data warehouses require significant domain experience and manually-configured rules to enable the extraction of useful data. Modern data warehouses add machine learning, AI and deep learning … continue reading
Ascend empowers everyone to create smarter products. Ascend provides a fully-managed platform for data analysts, data scientists, and analytics/BI engineers to create Autonomous Data Pipelines that fuel analytics and machine learning applications. Leveraging the platform, these teams can collaborate and adopt DataOps best practices as they self-serve and iterate with data and create reusable, self-healing … continue reading
After watching application teams, security teams and operations teams get the -Ops treatment, data engineering teams are now getting their own process ending in -Ops. While still in its very early days, data engineers are beginning to embrace DataOps practices. Gartner defines DataOps as “a collaborative data manager practice, really focused on improving communication, integration, … continue reading
Mozilla is releasing an experimental scientific communication and exploration tool for the web. Iodide enables data scientists to create, share, collaborate on and reproduce reports and visualizations on the web with familiar tools. According to the company, while the data and scientific computing world is exploding, there has been little work done to give scientists access … continue reading
The upcoming release of Google’s operating system was unveiled this week, and the company is now giving insights into how it will handle location data and what that will mean for developers. According to the company, while location data can be imperative to giving users recommendations based on where they are, it is also a … continue reading
NVIDIA is expanding its footprint in the processing and high-performance computing industry with the acquisition of Mellanox. The company is acquiring Mellanox for $125 per share, approximately $6.9 billion total. According to NVIDIA, the acquisition will help tackle the growing data and compute load necessary for AI, scientific computing and data analytics. Together, NVIDIA’s … continue reading
There are a number of published datasets available on the web for developers and researchers to take advantage of, experiment with, and build interesting solutions from. However, just because a dataset is open and available doesn’t mean it will necessarily be useful. To make data more accessible and beneficial to the industry, Google has committed … continue reading
TIBCO has announced that it has acquired in-memory data platform SnappyData. According to TIBCO, the acquisition will complement its Connected Intelligence platform by introducing a unified analytics data fabric that will enhance “analytics, data science, streaming, and data management for various use cases requiring speed, volume, and agility.” “This acquisition aligns with our long-standing commitment … continue reading