Topic: pandas

SD Times Open-Source Project of the Week: Text Extensions for Pandas

IBM recently announced the open-source library Text Extensions for Pandas, which features extensions that turn Pandas DataFrames into a universal data structure that can be used in natural language processing (NLP). According to the company, the goal of this project is to make NLP simple. In creating the library, it wanted to avoid creating algorithms …

Guest View: The first release of Apache Arrow

Work on Apache Arrow has been progressing rapidly since its inception earlier this year, and now Arrow is the open-source standard for columnar in-memory execution, enabling fast vectorized data processing and interoperability across the Big Data ecosystem. Background: Apache Parquet is now the de facto standard for columnar storage on disk, and building on that …
