Data is the information that drives business. It can be structured in rows and columns, like a customer's name, address, and phone number, or it can be unstructured, like an email or a social media post. Structured data populates relational database management systems such as those from Oracle, IBM and Microsoft, as well as the open-source PostgreSQL and MySQL, among others. That data can be accessed using the standard Structured Query Language (SQL). Unstructured data typically resides in what are called NoSQL databases, such as Cassandra, Couchbase, MongoDB and many others. Many organizations today run both kinds of databases.
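To make the distinction concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for a relational system and plain JSON strings as a stand-in for a document store. The table name, field names, and sample records are all hypothetical and chosen only for illustration. JSON documents are strictly speaking semi-structured, but they show the schema-less style that NoSQL stores tend to favor.

```python
import json
import sqlite3

# Structured data: every record has the same columns and is queried with SQL.
# (Hypothetical "customers" table; SQLite stands in for any relational database.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ada Example", "1 Main St", "555-0100"),
        ("Bob Sample", "2 Oak Ave", "555-0101"),
    ],
)
rows = conn.execute(
    "SELECT name, phone FROM customers WHERE address = ?",
    ("2 Oak Ave",),
).fetchall()

# Unstructured data: free text with no shared schema, stored here as JSON
# documents (a stand-in for a document-oriented NoSQL store).
post = {"type": "social_post", "body": "Loving the new release!", "tags": ["release"]}
email = {"type": "email", "subject": "Invoice question", "body": "Hi, I have a question."}
documents = [json.dumps(doc) for doc in (post, email)]

# Without fixed columns, retrieval falls back to scanning the text itself.
matches = [
    doc
    for doc in (json.loads(raw) for raw in documents)
    if "release" in doc["body"].lower()
]

print(rows)
print([doc["type"] for doc in matches])
```

The SQL query can rely on the `address` column existing in every row, while the document search has to inspect each record's text because the two documents share no fixed set of fields.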
Once the data is stored, it must be easily retrievable, found amid the mountains of data organizations collect, and made available at scale. Numerous tools exist for those jobs, including Hadoop, Apache Spark and many more. It is through the collection and analysis of data that businesses can make decisions that affect their bottom line.
Databricks has announced it is donating its open-source data lakes project to the Linux Foundation. Delta Lake is designed to improve the reliability, quality and performance of data lakes. Databricks first announced the project in April. “Today, nearly every company has a data lake they are trying to gain insights from, but data lakes have … continue reading
Melissa has announced new updates to its customer data verification solution Unison. Unison is a browser-based data cleaning and reporting solution designed to help data stewards create and maintain data quality without any programming knowledge. New features include a wizard-based matching interface, fuzzy match scoring, and improved reporting. “Our existing MatchUp deduplication software is known … continue reading
Companies are obsessing over data — whether it’s gathering data, analyzing data, or gaining insights from that data. But perhaps they’re not making the most of that data. The biggest challenge when it comes to data is not in the collection, storage, or analysis of that data, it’s how to effectively use that data to … continue reading
As any business leader will tell you, data is the lifeblood of organizations operating in the 21st century. A company’s ability to effectively gather and use data can make all the difference in its success. But a number of factors can compromise data’s health, making it unmanageable and therefore unusable for today’s businesses. Specifically, data … continue reading
Cloudera today announced the first instantiation of its enterprise data cloud, the Cloudera Data Platform, a native cloud service to manage data and workloads on any cloud. Many enterprises are creating multi-cloud strategies but face increased complexities due to having some workloads in Microsoft Azure, for instance, while others live in Amazon or Google Cloud, … continue reading
New Relic wants to provide users with more than just dashboard analytics for their applications. The company announced a new observability platform at its FutureStack conference today in New York City. New Relic One is designed to connect user experience and business data with capabilities like New Relic Logs, Traces, Metrics and AI. The platform … continue reading
Cloud-based data warehouse platforms are making it easier for organizations to enable self-service analytics capabilities that tap disparate data sources. Similarly, modern data lakes offered by the large public cloud providers allow developers to create or customize data models on an ad-hoc basis for machine learning, which enables artificial intelligence and automation. In either of … continue reading
HPCC Systems (High Performance Computing Cluster), a dba of LexisNexis Risk Solutions, is an open-source big-data computing platform. Flavio Villanustre, vice president technology and CISO at LexisNexis Risk Solutions, explained HPCC Systems’s evolution came as a necessity. “In 2000 we were getting into data analytics, using the platforms, databases, and data integration tools that were … continue reading
A number of Python projects are promising to transition to Python 3 by the end of the year. Currently, a majority of Python packages and projects support Python 3.x and Python 2.7. The Python programming language development team has announced support for Python 2.7 will end by the end of the year. Projects have had … continue reading
The need for more privacy while surfing the web has been a hot topic lately, and while some organizations are creating initiatives to bolster that need, Google found that an agreed-upon set of standards was essential to steer user privacy in the right direction. To address this, Google announced its Privacy Sandbox initiative. The … continue reading
As computing moves from on-premises to the public cloud and the edge, protecting data has become more complex, prompting Intel, Google, Microsoft, the Linux Foundation and other technology partners to launch a cross-industry effort for organizations to safely share data insights through the Confidential Computing Consortium. “The … continue reading
One of the striking attributes of the contemporary state of digital transformation is the conjunction of the ubiquity of digitization with its incomplete realization in a multitude of consumer and business contexts. On one hand, digital transformation has succeeded in empowering consumers to access data and information about just about any topic—whether it be the … continue reading