Data is the information that drives business. It can be structured in rows and columns, like a customer name, address, and phone number, or it can be unstructured, such as an email or a social media post. Structured data is stored in relational database management systems such as those from Oracle, IBM and Microsoft, as well as the open-source PostgreSQL and MySQL, among others. That data can be accessed using the standard Structured Query Language (SQL). Unstructured data resides in what are called NoSQL databases, such as Cassandra, Couchbase, MongoDB and many others. Many organizations today run both kinds of databases.
Once the data is stored, it must be easy to find amid the mountains of data organizations collect, easy to retrieve, and available at scale. Numerous tools exist for those jobs, including Hadoop, Apache Spark and many more. It is through the collection and analysis of data that businesses can make decisions that affect their bottom line.
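As a rough illustration of the two kinds of data described above, the sketch below stores a customer record as a row and queries it with SQL, then searches a handful of free-form documents by keyword. It uses Python's built-in sqlite3 module purely for convenience. The table name, sample values, and the helper functions `structured_lookup` and `document_lookup` are hypothetical, and the in-memory list only stands in for a document store rather than showing how any particular NoSQL database works.

```python
import sqlite3


def structured_lookup(name):
    """Store one customer row in an in-memory SQLite table and query it with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        # Structured data: every record has the same named columns.
        conn.execute("CREATE TABLE customers (name TEXT, address TEXT, phone TEXT)")
        conn.execute(
            "INSERT INTO customers VALUES (?, ?, ?)",
            ("Ada Example", "1 Sample Street", "555-0100"),  # made-up sample values
        )
        return conn.execute(
            "SELECT name, address, phone FROM customers WHERE name = ?", (name,)
        ).fetchone()
    finally:
        conn.close()


# Unstructured or loosely structured data: each document can carry different fields.
DOCUMENTS = [
    {"type": "email", "subject": "Invoice", "body": "Please find the invoice attached."},
    {"type": "post", "body": "Loving the new release!", "tags": ["launch"]},
]


def document_lookup(documents, keyword):
    """Return the documents whose free-text body mentions the keyword."""
    return [
        doc for doc in documents
        if keyword.lower() in doc.get("body", "").lower()
    ]
```

Calling `structured_lookup("Ada Example")` returns the stored row as a tuple, while `document_lookup(DOCUMENTS, "invoice")` returns only the email, since no fixed schema tells the search which field to match beyond the free-text body.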
Getting accurate address data for customers is a challenge on its own, but getting accurate legislative district data adds an entirely new level of difficulty on top. There are a number of reasons why developers might need access to that data, however, such as advocacy groups trying to do outreach that involves connecting voters to … continue reading
PostgreSQL 18 has been released, with several new features like asynchronous I/O, better post-upgrade performance, and improved text processing. Asynchronous I/O allows PostgreSQL to issue multiple I/O requests at the same time rather than waiting for one to finish before starting the next. According to the PostgreSQL team, this improves overall throughput, and has resulted … continue reading
CData has announced the launch of a new managed Model Context Protocol (MCP) platform bringing together AI assistants, agent orchestration, workflow automation, and embedded AI applications—combined with access to over 300 enterprise data sources. According to the company, Connect AI preserves data semantics and relationships in enterprise data to give AI agents better context while … continue reading
A number of data companies—including Snowflake, Salesforce, BlackRock, dbt Labs, and RelationalAI—have announced the formation of a new open source initiative to create a vendor-neutral standard for defining and sharing semantic metadata. The Open Semantic Interchange has three main goals: enhance interoperability across tools and platforms, accelerate adoption of AI and BI applications, and … continue reading
Today at its user conference MongoDB.local NYC, the popular database company announced that the Search and Vector Search capabilities that have been available in the Atlas cloud platform are now available in preview in the Community Edition and Enterprise Server. Previously, customers using self-managed versions of MongoDB would have needed to use a third-party service … continue reading
Microsoft announced the latest innovations to Microsoft Fabric at a user conference for the platform, FabCon. Microsoft Fabric is a platform that brings data from multiple sources into one place. The company announced the launch of Graph, a low-code platform for modeling and analyzing relationships in data. According to Microsoft, Graph is based on the … continue reading
Today at the Open Source Summit Europe, The Linux Foundation announced that the open-source document database, DocumentDB, would be joining the foundation and be released under the MIT license. DocumentDB was created by Microsoft and launched earlier this year. Since its release, it has gained 1.9k stars and hundreds of contributions, feedback, and users, according … continue reading
Melissa, a provider of data quality solutions, is making it possible for customers to run its SQL Server Integration Services (SSIS) components in Azure Data Factory through the Azure-SSIS Integration Runtime. This brings the power of Melissa’s data quality features to the cloud, providing customers greater flexibility in where their data is stored. SSIS is … continue reading
The team behind the open-source distributed NoSQL database ScyllaDB has announced a new iteration of its managed offering, this time focusing on adapting workloads based on demand. ScyllaDB X Cloud can scale up or down within a matter of minutes to meet actual usage, eliminating the need to overprovision for worst-case scenarios or deal with … continue reading
At its Data + AI Summit, Databricks announced several new tools and platforms designed to better support enterprise customers who are trying to leverage their data to create company-specific AI applications and agents. Lakebase is a managed Postgres database designed for running AI apps and agents. It adds an operational database layer to Databricks’ … continue reading
As every company moves to implement AI in some form or another, data is king. Without quality data to train on, the AI likely won’t deliver the results people are looking for and any investment made into training the model won’t pay off in the way it was intended. “If you’re training your AI model … continue reading
ABBYY is introducing a new optical character recognition (OCR) API to enable developers to extract data from unstructured documents. “As a vanguard of OCR, ABBYY has long had a vibrant community of cutting-edge developers creating transformational solutions with our advanced document AI,” said Nick Hyatt, vice president of Engineering R&D at ABBYY. “ABBYY Document AI … continue reading