I’d like to write a Snowflake scalar UDF in Python which takes a variable number of string arguments. The idea I had was to use array type for this (i.e. array can handle variable number of elements) but the issue I had with this was that as soon as a single element of array argument was NULL, Snowflake threw an error and I need to be able to handle these NULLs.
Code:
create or replace function udx_py_uuid5_null(a array) returns string language python runtime_version = 3.8 handler = 'f_uuid5' as $$ import uuid from typing import List, Optional _null_uuid = uuid.UUID('00000000-0000-0000-0000-000000000000') def f_uuid5(args: List[Optional[str]]): args = [arg if not getattr(arg, "is_sql_null", False) else '' for arg in args] return str(uuid.uuid5(_null_uuid, ':'.join(args))) $$;
I’m trying to handle NULLs as described here but it seems to me as if this only applies to “first-level” arguments. As shown in the solution, desired behaviour is to replace NULLs with empty string before calculating the UUID.
Call:
select udx_py_uuid5_null([NULL, 'test', 'test1', NULL]);
Error message:
Python Interpreter Error: Traceback (most recent call last): File "_udf_code.py", line 7, in f_uuid5 TypeError: sequence item 0: expected str instance, NoneType found in function UDX_PY_UUID5_NULL with handler f_uuid5
Advertisement
Answer
Instead of trying to replacing NULLs with empty string why not simply remove None
values:
args = [arg if not getattr(arg, "is_sql_null", False) else '' for arg in args]
=>
args = list(filter(lambda arg: arg is not None, args))
Full code:
create or replace function udx_py_uuid5_null(a array) returns string language python runtime_version = 3.8 handler = 'f_uuid5' as $$ import uuid from typing import List, Optional _null_uuid = uuid.UUID('00000000-0000-0000-0000-000000000000') def f_uuid5(args: List[Optional[str]]): args = list(filter(lambda arg: arg is not None, args)) return str(uuid.uuid5(_null_uuid, ':'.join(args))) $$; select udx_py_uuid5_null([NULL, 'test', 'test1', NULL]);
Output: