
How to vectorize a function with lists as argument?

I need help vectorizing a function in numpy. In Julia, I can do something like this:

((a,b,c) -> [a,b,c]).([[1,2],[3,4]],[[5,6],[7,8]],nothing)

which returns

2-element Vector{Vector{Union{Nothing, Vector{Int64}}}}:
 [[1, 2], [5, 6], nothing]
 [[3, 4], [7, 8], nothing]

It takes one sublist at a time from each iterable and repeats the scalar nothing for every call.

In Python, I can't get similar behaviour. I tried:

np.vectorize(lambda a,b,c: [a,b,c])([[1,2], [3,4]], [[5,6], [7,8]], None)

but it returns:

array([[list([1, 5, None]), list([2, 6, None])],
       [list([3, 7, None]), list([4, 8, None])]], dtype=object)

If I do:

np.vectorize(lambda a,b,c: print(a,b,c))([[1,2], [3,4]], [[5,6], [7,8]], np.nan)

I get back:

1 5 nan
1 5 nan
2 6 nan
3 7 nan
4 8 nan

I tried the excluded parameter, but it excludes the whole array:

np.vectorize(lambda a,b,c: print(a,b,c), excluded=[0])([[1,2], [3,4]], [[5,6], [7,8]], np.nan)

prints:

[[1, 2], [3, 4]] 5 nan
[[1, 2], [3, 4]] 5 nan
[[1, 2], [3, 4]] 6 nan
[[1, 2], [3, 4]] 7 nan
[[1, 2], [3, 4]] 8 nan

By the way, the actual function is a sklearn function, not a lambda one.


Answer

You gave it two (2,2) arguments and a scalar. np.vectorize called your function 4 times, each time with a tuple of scalar values drawn from those 3 arguments after broadcasting them together.

You also see that with the print version. There's an additional call at the start, used to determine the return dtype. Your first function returns a list, so the result has dtype=object.
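To make the extra call visible, here is a small sketch that counts calls with and without otypes. The names combine and calls are made up for illustration; the only assumption is that supplying otypes lets np.vectorize skip the trial call, which is how I understand its documented behaviour.

```python
import numpy as np

calls = []  # records every tuple of arguments the function receives


def combine(a, b, c):
    calls.append((a, b, c))
    return [a, b, c]


x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]

# No otypes: one trial call on the first element, then one call per element.
np.vectorize(combine)(x, y, None)
calls_without_otypes = len(calls)

# With otypes the return dtype is known up front, so the trial call is skipped.
calls.clear()
result = np.vectorize(combine, otypes=[object])(x, y, None)
calls_with_otypes = len(calls)

print(calls_without_otypes, calls_with_otypes)
print(result.shape)
```

Either way the function still sees scalars, not sublists, so otypes alone does not give the Julia-like behaviour.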

With excluded, it doesn't iterate over the values of the 1st argument; it just passes the whole thing to every call.

Here’s the right way to create your list of lists:

In [811]: a,b,c = [[1,2], [3,4]], [[5,6], [7,8]], None

In [813]: [[i,j,None] for i,j in zip(a,b)]
Out[813]: [[[1, 2], [5, 6], None], [[3, 4], [7, 8], None]]
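Since your real function is a sklearn one rather than a lambda, the same comprehension can be wrapped in a small helper. This is only a sketch: apply_pairwise and its parameter names are hypothetical, and the lambda stands in for whatever function you actually call.

```python
def apply_pairwise(func, first, second, constant=None):
    """Call func once per pair of sublists, passing constant unchanged."""
    return [func(i, j, constant) for i, j in zip(first, second)]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# The lambda is a placeholder for the real function.
rows = apply_pairwise(lambda i, j, k: [i, j, k], a, b)
print(rows)
```

This keeps the sublists as plain Python lists and calls the function exactly once per pair.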

If we add a signature (and otypes):

In [821]: f = np.vectorize(lambda a,b,c: [a,b,c], signature='(n),(n),()->()', otypes=[object])
In [822]: f(a,b,c)
Out[822]: 
array([list([array([1, 2]), array([5, 6]), None]),
       list([array([3, 4]), array([7, 8]), None])], dtype=object)

Now it calls the function only twice. But it runs much slower. Read, and reread, the notes about performance in the np.vectorize documentation.

If we make the list arguments into arrays first:

In [825]: A,B = np.array(a), np.array(b)
In [826]: A,B
Out[826]: 
(array([[1, 2],
        [3, 4]]),
 array([[5, 6],
        [7, 8]]))

the signature version of f returns the same thing, showing that vectorize converts the lists to arrays:

In [827]: f(A,B,c)
Out[827]: 
array([list([array([1, 2]), array([5, 6]), None]),
       list([array([3, 4]), array([7, 8]), None])], dtype=object)
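If the function needs plain lists rather than arrays, one option is to convert inside the function. A minimal sketch, assuming the same signature as above (f_lists is just an illustrative name):

```python
import numpy as np

# With this signature each call receives two 1-d arrays and a scalar,
# so .tolist() turns the arrays back into ordinary Python lists.
f_lists = np.vectorize(
    lambda a, b, c: [a.tolist(), b.tolist(), c],
    signature='(n),(n),()->()',
    otypes=[object],
)

out = f_lists([[1, 2], [3, 4]], [[5, 6], [7, 8]], None)
print(out.shape)
print(out.tolist())
```

The result is still a 1-d object array holding two lists, and the performance caveat about the signature path applies here as well.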

If we pass the arrays to the list comprehension instead, we get:

In [829]: np.array([[i,j,None] for i,j in zip(A,B)], object)
Out[829]: 
array([[array([1, 2]), array([5, 6]), None],
       [array([3, 4]), array([7, 8]), None]], dtype=object)
In [830]: _.shape
Out[830]: (2, 3)
User contributions licensed under: CC BY-SA