
tf.data: create a Dataset from a list of Numpy arrays of different shape

I have a list of Numpy arrays of different shape.

I need to create a Dataset, so that each time an element is requested I get a tensor with the shape and values of the given Numpy array.

How can I achieve this?

This does NOT work:

dataset = tf.data.Dataset.from_tensor_slices(list_of_arrays)

because, as expected, it fails with:

Can't convert non-rectangular Python sequence to Tensor.

P.S. I know that it will not be possible to batch a Dataset whose elements have different shapes.


Answer

Have you tried converting the list to a ragged tensor first?

tensor_with_from_dimensions = tf.ragged.constant([[1, 2], [3], [4, 5, 6]])

Bear in mind that, according to the documentation for tf.ragged.constant:

All scalar values in pylist must have the same nesting depth K, and the returned RaggedTensor will have rank K. If pylist contains no scalar values, then K is one greater than the maximum depth of empty lists in pylist. All scalar values in pylist must be compatible with dtype.
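As a minimal sketch of how the ragged tensor could then feed a Dataset, assuming TensorFlow 2.x and 1-D arrays of different lengths: the sample `list_of_arrays` below is made up for illustration. Arrays are converted with `.tolist()` so that every scalar sits at the same nesting depth. Arrays of different rank would break that rule and would not work this way.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sample data: 1-D arrays of different lengths.
list_of_arrays = [np.array([1, 2]), np.array([3]), np.array([4, 5, 6])]

# Convert to nested Python lists; every scalar ends up at nesting depth 2.
ragged = tf.ragged.constant([a.tolist() for a in list_of_arrays])

# Slicing along the first axis gives one variable-length row per element.
dataset = tf.data.Dataset.from_tensor_slices(ragged)

# Collect the rows back as plain lists to inspect them.
rows = [element.numpy().tolist() for element in dataset]
print(rows)
```

Each element requested from the dataset carries the length and values of the corresponding input array.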

You can read more about it here: https://www.tensorflow.org/api_docs/python/tf/ragged/constant
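If the arrays differ in rank as well as in length, a different technique from the ragged tensor suggested above may be easier: `tf.data.Dataset.from_generator` with an `output_signature` whose shape is left unknown. The following is only a sketch under assumptions: it needs a TensorFlow version that supports `output_signature`, all arrays must share one dtype (float32 here), and the sample arrays and the `generate_arrays` helper are hypothetical.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sample data: arrays with different ranks and shapes.
list_of_arrays = [
    np.zeros((2, 3), dtype=np.float32),
    np.ones((4,), dtype=np.float32),
    np.full((1, 2, 2), 7.0, dtype=np.float32),
]

def generate_arrays():
    """Yield the arrays one at a time (illustrative helper)."""
    for array in list_of_arrays:
        yield array

# shape=None leaves both the rank and the dimensions unspecified,
# so each element keeps the shape of the array it came from.
dataset = tf.data.Dataset.from_generator(
    generate_arrays,
    output_signature=tf.TensorSpec(shape=None, dtype=tf.float32),
)

# Read back the concrete shape of each element.
shapes = [element.numpy().shape for element in dataset]
print(shapes)
```

As the question already notes, such a dataset cannot be batched with a plain `batch` call, because the elements do not share a shape.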

User contributions licensed under: CC BY-SA