Data sprawl is worsening: 43% of respondents to a new survey said they use an average of four to six platforms to manage their data, and another 11% use an average of 10-12 platforms.
The survey, “The State of Data and What’s Next,” was commissioned by Red Hat and Starburst and conducted by Enterprise Management Associates (EMA).
The research also revealed that companies are continuing to add new data types to their environments. For example, 65% of respondents plan to collect streaming data and 60% plan to collect video and event data.
Another key finding of the report is that bottlenecks persist in data pipelines. Forty-eight percent of respondents take over 24 hours to create a data pipeline and then another 24 hours to move that pipeline into production. This pain point is pushing many organizations toward a more decentralized model, such as a data mesh.
AI and machine learning are placing increasing pressure on these systems as well, since they depend heavily on data. Thirty-one percent of respondents said that it’s difficult to find data for their models because it’s constantly being moved around. To remedy this, more automation of AI and machine learning workloads is needed, along with better data access.
“Customers creating AI/ML enabled applications must rely on accessible data in order to accelerate model development and the deployment of intelligent applications across hybrid multicloud environments,” said Steven Huels, senior director of software engineering at Red Hat. “By creating a foundation for data and applications on cloud architecture, developers and data scientists can more quickly and repeatedly meet their business goals through the delivery of data-driven, intelligent applications.”