Today, data has become the main currency of organizations, fueling analytics, AI/ML, and better data-driven decisions. But all of that data adds a lot of complexity for data professionals, which in turn slows time to insight and leaves much of the data's value unrealized.
“How cloud object storage is handled today is that data is passed on to an engine, such as Spark, where you can run either Java code or Python code. And different users are going to run that in different clusters, different environments that are being managed, secured, and all that stuff. And so there’s all the infrastructure management related to that,” said Julian Forero, senior product marketing manager at Snowflake, in an SD Times Live! event.
According to Forero, this introduces challenges around processing, complexity, and capacity management, and creates silos because data is copied to different environments.
To overcome this, Snowflake built Snowpark, which allows data professionals to use their programming language of choice, collaborate on the same platform, and work with the same data, while still getting the benefits of simplicity, access, performance, scalability, governance, and security.
According to the Snowpark documentation, a number of features set it apart from other client libraries, including constructs for building SQL statements, lazy execution of operations on the server, and the ability to create user-defined functions (UDFs).
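To illustrate the lazy-execution idea in general terms, the sketch below shows a pipeline that records operations and only runs them when results are requested. Note that this is plain Python with a hypothetical `LazyFrame` class, not Snowpark's actual API; in Snowpark the deferred work is executed on the server rather than locally.

```python
class LazyFrame:
    """Records transformations instead of running them immediately."""

    def __init__(self, rows, ops=None):
        self._rows = rows
        self._ops = ops or []  # pending operations, not yet executed

    def filter(self, predicate):
        # Nothing is computed here; we just queue the operation.
        return LazyFrame(self._rows, self._ops + [("filter", predicate)])

    def select(self, *columns):
        return LazyFrame(self._rows, self._ops + [("select", columns)])

    def collect(self):
        # Only now does the pipeline actually run, in one pass.
        rows = self._rows
        for op, arg in self._ops:
            if op == "filter":
                rows = [r for r in rows if arg(r)]
            elif op == "select":
                rows = [{c: r[c] for c in arg} for r in rows]
        return rows


orders = LazyFrame([
    {"id": 1, "region": "EMEA", "amount": 120},
    {"id": 2, "region": "AMER", "amount": 80},
    {"id": 3, "region": "EMEA", "amount": 300},
])

# Building the pipeline does no work yet...
pipeline = orders.filter(lambda r: r["region"] == "EMEA").select("id", "amount")
# ...execution happens only when collect() is called.
result = pipeline.collect()  # [{'id': 1, 'amount': 120}, {'id': 3, 'amount': 300}]
```

Deferring execution this way lets a query engine see the whole pipeline at once and optimize it before any data moves, which is part of what Forero credits for Snowpark's performance and simplicity.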
It can be used with the Python, Java, and Scala languages. Example use cases include processing semi-structured and unstructured data and making data science accessible to business users.
To showcase Snowpark, Caleb Baechtold, field CTO and architect of data science at Snowflake, gave SD Times viewers a demo of the product during the free webinar. To learn more about how it works, give the video a watch.