Data is the information that drives business. It can be structured in rows and columns, such as a customer's name, address, and phone number, or it can be unstructured, such as an email or a social media post. Structured data is stored in relational database management systems (RDBMS) such as those from Oracle, IBM, and Microsoft, as well as the open-source PostgreSQL and MySQL, among others. That data can be accessed using the standard Structured Query Language (SQL). Unstructured data typically resides in NoSQL databases, such as Cassandra, Couchbase, MongoDB, and many others. Many organizations today run both kinds of databases.
Once the data is stored, it must be easily retrievable, found amid the mountains of data organizations collect, and made available at scale. Numerous tools exist for those jobs, including Hadoop, Apache Spark and many more. It is through the collection and analysis of data that businesses can make decisions that affect their bottom line.
Voltron Data has announced the release of Ibis 8.0, an update to its popular Python dataframe API, which has been downloaded over 10 million times. Ibis enables developers to run code across various data platforms by choosing the most suitable query engine for specific tasks. The latest version introduces the first dedicated streaming backends for … continue reading
Entity resolution — the process of determining when data records are about the same person, organization or other entity, despite differences in how they are described — is an important problem for companies to solve if they want to improve data quality and outcomes, but the process can be a complicated one. Brian Macy, director … continue reading
This week is Data Privacy Week, with this year’s event theme being “take control of your data.” Hosted by the National Cybersecurity Alliance, Data Privacy Week (and Day) is an annual outreach event that focuses on a different theme each year. “With this year’s theme of ‘Take Control of Your Data,’ Data Privacy Week holds … continue reading
Drowning in documents? Buried in emails? Forget endless scrolling and time-consuming searches. dtSearch is here to be your information hero, offering lightning-fast retrieval across mountains of data. Whether you’re a lone researcher or a large enterprise, dtSearch has a solution tailored to your needs. At its core, dtSearch is a powerful full-text search engine. It … continue reading
When low-code platforms first burst onto the scene, many considered them game-changers. The ability to rely less on traditional programmers, despite limited coding knowledge, promised a democratizing revolution in software, even if questions of governance at scale, security, and long-term maintenance were not yet fully resolved. But as businesses grew and innovated, many companies still … continue reading
Data Profiler is an open-source Python library that originated at Capital One to analyze datasets and detect if any of the information contained within is sensitive data, such as bank account numbers, credit card information, or social security numbers. According to the company, when data streams grow large enough, it can be quite difficult to … continue reading
Quest Software, a provider of systems management, data protection, and security software, has announced the general availability of Toad Data Studio. This all-in-one platform is designed to streamline database management across multi-database platform environments. The release comes at a time when the complexity of database infrastructure is increasing and enterprises are struggling with agility and … continue reading
DataStax has announced that it has revamped the developer experience for Astra DB, which is a vector database used for building AI applications. First, it is releasing a new API for building generative AI applications: the Astra DB Data API. This new API helps developers create these apps by providing all of the necessary data … continue reading
Google has announced significant improvements to Google Cast, including the Output Switcher, accessible via the Android System UI. This feature facilitates the transfer and control of media across different devices and technical protocols. With the release of Output Switcher 2.0 on Android U, improvements include enhanced volume control, device categorization, and support for devices with … continue reading
The Data Quality 2023 Study reveals that a significant 34% of the organizations responding are at the ‘Data Aware’ stage, indicating they are in the initial phases of recognizing the importance of data but have not yet fully integrated it into their decision-making processes. However, the most advanced stage, ‘Data Driven’, where data is … continue reading
The database company MongoDB has announced new capabilities to enable companies to better leverage generative AI. MongoDB Atlas Vector Search is now generally available, which allows customers to build generative AI into their applications based on their own data. This enables the AI to provide accurate, relevant responses for a specific organization or domain. Customers … continue reading
There are many reasons why duplicate entries might end up in a database, and it’s important that companies have a way to deal with them to ensure their customer data is as accurate as possible. In Episode 5 of the SD Times Live! Microwebinar series on data verification, Tim Sidor, data quality analyst at data … continue reading