The Snowpark pandas API lets you run your pandas code directly on your data in Snowflake. Just by changing the import statement and a few lines of code, you can get the same pandas-native experience you know and love at the speed and scale of Snowflake. See the Snowpark pandas API documentation to learn more.
- Download the notebook file and data from the corresponding directory.
- Create a new conda environment using the command:
conda create --name snowpark-pandas-demo python=3.9 --yes
- Activate the conda environment with:
conda activate snowpark-pandas-demo
- Install the Snowpark Python library with Modin:
pip install "snowflake-snowpark-python[modin]"
- Install the jupyter package:
pip install jupyter
- Launch Jupyter Notebook:
jupyter notebook
- Open the notebook.
- Follow the instructions here to set up a default Snowflake connection. For example:
# Create a Snowpark session with a default connection.
from snowflake.snowpark.session import Session
session = Session.builder.create()
- Import Modin and the Snowpark pandas plugin for Modin:
import modin.pandas as spd
import snowflake.snowpark.modin.plugin
- Start using Snowpark pandas!
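
If an import fails in the notebook, it can help to confirm that the packages installed in the steps above are visible from the active conda environment. The following is a minimal sketch of such a check, not part of the official setup. The helper `missing_modules` and the list `REQUIRED_MODULES` are hypothetical names introduced here, and `notebook` is an assumption about which module `pip install jupyter` provides.

```python
import importlib.util
import sys

# Module names taken from the setup steps above. "notebook" is an assumption
# about what the jupyter package installs; adjust the list if it differs.
REQUIRED_MODULES = ["modin.pandas", "snowflake.snowpark", "notebook"]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            # For dotted names, find_spec imports the parent package first.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Run it with the `snowpark-pandas-demo` environment activated. Any name it reports as missing points to an install step that did not complete in that environment.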