Data Management explained

Data is the information that drives business. It can be structured in rows and columns, like a customer name, address and phone number, or it can be unstructured, such as an email or a social media post. Structured data is stored in relational database management systems from vendors such as Oracle, IBM and Microsoft, as well as in open-source systems such as PostgreSQL and MySQL, among others. That data can be accessed using the standard Structured Query Language (SQL). Unstructured data typically resides in what are called NoSQL databases, such as Cassandra, Couchbase, MongoDB and many others. Many organizations today run both kinds of databases.

Once the data is stored, it must be easy to find and retrieve amid the mountains of data organizations collect, and it must be available at scale. Numerous tools exist for those jobs, including Hadoop, Apache Spark and many more. By collecting and analyzing data, businesses can make decisions that affect their bottom line.

New report highlights overconfidence vs reality of AI implementation

With all the potential benefits promised by the use of AI, it’s no wonder companies want to get in on the action. But a new survey from Capital One reveals a stark disconnect between how confident business leaders are in their company’s ability to implement AI and how the technology professionals actually implementing the … continue reading

Snowflake releases new capabilities for companies to better collaborate around their data

The data platform Snowflake is hosting its annual user conference, BUILD 2024, bringing together data scientists and developers and sharing new functionality across its platform that will enable customers to get more value from their data and build AI functionality on top of it. New updates across the Snowflake platform enable greater collaboration, flexibility, and security … continue reading

Elastic adopts more efficient approach for storing vectorized data

Elastic is implementing a new approach for storing vectorized data that will require 95% less memory. Better Binary Quantization, or BBQ, is based on a technique called RaBitQ, which was developed earlier this year by researchers at Nanyang Technological University Singapore. According to Elastic, the biggest differences between BBQ and native binary quantization are that: … continue reading

Navigating the complexities of managing global address data

The U.S. Postal Service (USPS) delivers mail to almost 167 million addresses in the United States, and anyone who has tried to order something online has likely had the experience of not getting a package delivered on time (or at all) because the address was entered incorrectly or in a weird format, causing shipping delays. … continue reading

Microsoft enhances Data Wrangler with the ability to prepare data using natural language with new GitHub Copilot integration

Microsoft has announced that GitHub Copilot is now integrated with Data Wrangler, an extension for VS Code for viewing, cleaning, and preparing data. By integrating GitHub Copilot capabilities into the tool, users will now be able to clean and transform data in VS Code with natural language prompts. It will also be able to provide … continue reading

Google open sources Java-based differential privacy library

Google has announced that it is open sourcing a new Java-based differential privacy library called PipelineDP4J. Differential privacy, according to Google, is a privacy-enhancing technology (PET) that “allows for analysis of datasets in a privacy-preserving way to help ensure individual information is never revealed.” This enables researchers or analysts to study a dataset without accessing … continue reading

How Melissa’s Global Phone service cuts down on data errors and saves companies money

Having the correct customer information in your databases is necessary for a number of reasons, but especially when it comes to active contact information like email addresses or phone numbers. “Data errors cost users time, effort, and money to resolve, so validating phone numbers allows users to spend those valuable resources elsewhere,” explained John DeMatteo, … continue reading

Microsoft open-sources Drasi, a data processing system for detecting and reacting to changes

Microsoft has announced and is open-sourcing a new data processing system called Drasi that can detect and react to changes in complex systems. This new project “simplifies the automation of intelligent reactions in dynamic systems, delivering real-time actionable insights without the overhead of traditional data processing methods,” Mark Russinovich, CTO, deputy chief information security officer, … continue reading

MongoDB 8.0 offers significant performance improvements to read throughput, bulk writes, and more

MongoDB has announced the release of the latest version of its database platform—MongoDB 8.0. According to the company, this release offers significant performance improvements compared to MongoDB 7.0, such as 36% better read throughput, 56% faster bulk writes, 20% faster concurrent writes during replication, and 200% faster handling of higher volumes of time series data, … continue reading

New Chrome security features seek to better protect user privacy

Google is announcing several new Chrome features aimed at better protecting users as they browse the web. Safety Check (a tool that checks for compromised passwords, Chrome updates, and other potential security issues in the browser) has been updated to run automatically in the background so that it can be more proactive in … continue reading

Three considerations to assess your data’s readiness for AI

Organizations are getting caught up in the hype cycle of AI and generative AI, but in so many cases, they don’t have the data foundation needed to execute AI projects. A third of executives think that less than 50% of their organization’s data is consumable, emphasizing the fact that many organizations aren’t prepared for AI. … continue reading

Podcast: How time series data is revolutionizing data management

Time series data, which records measurements against time values, is an important part of making IoT devices like smart cars or medical equipment work properly. To learn more about the crucial role time series data plays in today’s connected world, we invited Evan Kaplan, CEO of InfluxData, onto our podcast to … continue reading
