Pandas is the go-to library for processing data in Python. It’s easy to use and quite flexible when it comes to handling different types and sizes of data. It has tons of different functions that make manipulating data a breeze.
But there is one drawback: Pandas runs on the CPU and is largely single-threaded, so it gets slow on larger datasets.
The RAPIDS cuDF library is a GPU DataFrame library built on Apache Arrow that accelerates loading, filtering, and manipulating data when preparing it for model training. It provides a pandas-like API that will be familiar to data scientists, so they can build GPU-accelerated workflows more easily.
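To give a feel for how close the two APIs are, here is a minimal sketch that runs the same group-by aggregation through either library. The helper name `mean_by_key`, the column names, and the row count are made up for illustration. It assumes cuDF is installed on a machine with a supported NVIDIA GPU, and falls back to pandas only when it is not.

```python
import time

import pandas as pd

try:
    import cudf  # needs an NVIDIA GPU and a RAPIDS install
except ImportError:
    cudf = None  # no GPU stack available: the sketch still runs with pandas only


def mean_by_key(xdf, n_rows=1_000_000):
    """Build a two-column frame with library `xdf` and time a group-by mean.

    `xdf` is either the pandas or the cudf module; both expose DataFrame,
    groupby, mean, and sort_index with the same spelling.
    """
    df = xdf.DataFrame({
        "key": [i % 10 for i in range(n_rows)],
        "value": [float(i) for i in range(n_rows)],
    })
    start = time.perf_counter()
    # sort_index makes the row order explicit instead of relying on defaults
    result = df.groupby("key")["value"].mean().sort_index()
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    _, pandas_time = mean_by_key(pd)
    print(f"pandas: {pandas_time:.4f} s")
    if cudf is not None:
        _, cudf_time = mean_by_key(cudf)
        print(f"cuDF:   {cudf_time:.4f} s")
```

The only thing that changes between the two runs is the module passed in. Treat the timings as rough: the first GPU call typically includes one-time startup cost, and on small frames the CPU can still come out ahead.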
In this video, I’ll show you the speedup that cuDF provides.