I have created this basic stored procedure to query a Snowflake table based on a customer id:
CREATE OR REPLACE PROCEDURE SP_Snowpark_Python_Revenue_2(site_id STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
AS
$$
from snowflake.snowpark.functions import *

def run(session, site_id):
    df_rev_tmp = session.table("revenue").select(col("site_id"), col("subscription_id"), col("country_name"), col("product_name"))
    df_rev_final = df_rev_tmp.filter(col("site_id") == site_id)
    return "SUCCESS"
$$;
It works fine but I would like my sproc to return a JSON object for the whole result set. I modified it thusly:
CREATE OR REPLACE PROCEDURE SP_Snowpark_Python_Revenue_3(site_id STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
AS
$$
from snowflake.snowpark.functions import *

def run(session, site_id):
    df_rev = session.table("revenue").select(col("site_id"), col("subscription_id"), col("country_name"), col("product_name"))
    df_rev_tmp = df_rev.filter(col("site_id") == site_id)
    df_rev_final = df_rev_tmp.to_pandas()
    df_rev_json = df_rev_final.to_json(orient = 'columns')
    return df_rev_json
$$;
It compiles without errors but fails at runtime with this error:
CALL SP_Snowpark_Python_Revenue_3('dfgerr6223')..... 255002: Optional dependency: 'pyarrow' is not installed...
What am I missing here?
Answer
The to_pandas() call depends on pyarrow, so you need to request pyarrow as a package alongside Snowpark:
PACKAGES = ('snowflake-snowpark-python', 'pyarrow')
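As an aside, if the result set is small, the pandas round trip may not be needed at all. Here is a rough sketch of a handler that collects the rows and serializes them with the standard json module instead. The helper name rows_to_json is made up for illustration, and I'm assuming each Snowpark Row exposes as_dict(); treat this as a starting point, not a tested drop-in replacement.

```python
import json


def rows_to_json(rows):
    """Serialize row-like objects to a JSON array string.

    Assumption: every row exposes as_dict(), as Snowpark Row objects do.
    default=str is a fallback for values json cannot encode (dates, decimals).
    """
    return json.dumps([row.as_dict() for row in rows], default=str)


def run(session, site_id):
    # Imported inside the handler so rows_to_json can be used without Snowpark installed.
    from snowflake.snowpark.functions import col

    df_rev = (
        session.table("revenue")
        .select(col("site_id"), col("subscription_id"), col("country_name"), col("product_name"))
        .filter(col("site_id") == site_id)
    )
    # collect() pulls the filtered rows into the procedure's memory as a list of Row objects.
    return rows_to_json(df_rev.collect())
```

This returns one JSON object per row rather than the per-column layout that to_json(orient = 'columns') produces. If you stay with to_pandas(), the PACKAGES line above is still required.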
But to get these packages, someone in your org will need to approve the Anaconda terms of service, or you’ll get the following error:
SQL compilation error: Anaconda terms must be accepted by ORGADMIN to use Anaconda 3rd party packages. Please follow the instructions at https://…
Someone with ORGADMIN role can follow these steps: