Komprise, the leader in analytics-driven unstructured data management, announces the general availability of Komprise Intelligent AI Ingest. The new Smart Data Workflow ingestion engine speeds the curation of the right unstructured data across disparate storage silos for AI. Komprise Intelligent AI Ingest boosts AI ROI by eliminating the noise, data risk and high cost of using unstructured data in RAG and LLM pipelines.
In the recent Komprise AI Data and Enterprise Risk survey, IT leaders cited ingesting the right unstructured data into AI and AI data governance as two major challenges. AI hype has given way to AI caution in the enterprise IT suite, as CIOs determine how to deploy AI safely and cost-effectively, with accurate outcomes that matter to company leadership.
Top issues with unstructured data ingestion to AI
- Suboptimal AI, RAG and LLM outcomes: Unstructured data is unorganized and contains large quantities of irrelevant, outdated and duplicate files. This reduces precision, clutters context windows and adds latency in AI pipelines. Studies show a 10% efficiency drop per 10,000 additional unstructured documents in a typical RAG pipeline, leading to reduced accuracy and poor outcomes.
- High cost of inferencing: Irrelevant unstructured data wastes expensive AI processing resources, drives up costs, reduces accuracy and ultimately erodes AI ROI. The costs compound during inferencing because the augmentation occurs for each prompt.
- Sensitive data leakage security risk: Ingesting data in bulk can lead to inadvertent sensitive data exposure in AI tools, violating privacy, security and compliance policies.
Top features of Komprise Intelligent AI Ingest
- Metadata-rich Global File Index: Komprise automatically builds metadata and delivers a single view of all file data within the enterprise, at scale, so you can find precisely the right data for your AI use case with simple queries.
- Precise curation boosts RAG efficiency: Unlike traditional ETL and data ingestion approaches that provide connectors to blindly copy data from a source, Komprise delivers a surgical approach with rich filters to eliminate low-quality and sensitive data during ingest.
- 2X ingestion performance improvement: In benchmark tests against a data transfer tool from a major cloud provider, Komprise doubles ingestion performance. This is possible due to a purpose-built transfer engine that minimizes file overhead for AI, supported by a massively parallel architecture.
- High-performance parallel architecture: The Komprise elastic grid architecture is parallelized in layers across multiple network interfaces, share engines and thread pools. This allows the solution to index and enrich metadata rapidly across billions of files and move large volumes of files to different AI tools and services as needed.
- Built-in PII, sensitive data handling: Komprise provides standard and custom sensitive data classification so you can reduce the risk of sensitive data leakage and compliance violations.
- Automated data governance auditing: Komprise automatically maintains an audit trail of each ingestion workflow for data governance and auditing, documenting the who, what, when and data lineage for compliance reporting.
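To make the curation idea in the feature list concrete, here is a minimal Python sketch of metadata-based filtering before ingest. It is an illustration only, not Komprise's implementation or API: the `FileRecord` fields, the `curate` function, the tag names and the sample paths are all hypothetical. It drops outdated, sensitive and duplicate files from a list of metadata records and returns a simple audit record of what was selected and skipped.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata row for one file; not a Komprise schema."""
    path: str
    content_hash: str
    modified: datetime
    tags: frozenset = field(default_factory=frozenset)


def curate(records, modified_after, blocked_tags):
    """Return (selected, audit) after dropping stale, sensitive and duplicate files."""
    selected, skipped, seen_hashes = [], [], set()
    for rec in records:
        if rec.modified < modified_after:
            skipped.append((rec.path, "outdated"))
        elif rec.tags & blocked_tags:
            skipped.append((rec.path, "sensitive"))
        elif rec.content_hash in seen_hashes:
            skipped.append((rec.path, "duplicate"))
        else:
            seen_hashes.add(rec.content_hash)
            selected.append(rec)
    # Assumed audit shape: when the run happened, what was kept, what was skipped and why.
    audit = {
        "run_at": datetime.now(timezone.utc).isoformat(),
        "selected": [r.path for r in selected],
        "skipped": skipped,
    }
    return selected, audit


# Made-up sample records for illustration.
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = [
    FileRecord("/share/a/report.pdf", "h1", datetime(2024, 6, 1, tzinfo=timezone.utc)),
    FileRecord("/share/b/report_copy.pdf", "h1", datetime(2024, 6, 2, tzinfo=timezone.utc)),
    FileRecord("/share/a/old_plan.doc", "h2", datetime(2019, 3, 1, tzinfo=timezone.utc)),
    FileRecord("/share/hr/payroll.xlsx", "h3", datetime(2024, 5, 1, tzinfo=timezone.utc),
               frozenset({"pii"})),
]
selected, audit = curate(sample, cutoff, {"pii"})
print([r.path for r in selected])
```

In this sketch only `/share/a/report.pdf` is selected; the other three files are recorded in the audit as a duplicate, an outdated file and a sensitive file. Filtering on metadata before any content is copied is the design point: excluded files never reach the AI pipeline.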
“Our mission is to help organizations untangle the mess of unstructured data to gain the greatest competitive advantage with AI,” says Kumar K. Goswami, CEO of Komprise. “Komprise Intelligent AI Ingest is the latest advancement in Smart Data Workflows to solve a critical customer pain point of efficiently finding and moving the right data to AI.”
“As organizations accelerate their journey toward becoming data-driven, DSMS solutions are evolving into intelligent platforms that do far more than manage storage,” according to Gartner*. “Modern Data Storage Management Services (DSMS) solutions are foundational to business analytics and generative AI (GenAI) initiatives, helping enterprises unlock the full value of their data by making it more discoverable, contextualized and actionable.”
Komprise AI Partner Ecosystem
Komprise partners with the leading storage and cloud platform vendors so you can freely move your data to the right place at the right time. Komprise recently added two new AI ecosystem partners:
- NVIDIA: NVIDIA customers can ingest the right unstructured data to NVIDIA GPUDirect Storage and NVIDIA NeMo DataStores and automatically manage the AI data lifecycle using Komprise. As an NVIDIA Connect partner, Komprise collaborates with NVIDIA to curate AI-ready unstructured data for model training and inferencing.
- SUSE Linux: Komprise allows SUSE Rancher customers to catalog their unstructured data, profile it and ingest the right data to Rancher for AI use cases. This partnership helps both companies develop validated joint solutions for AI-ready data and data lifecycle management.