
Behavior of Dataset.map in Tensorflow

I’m trying to take variable length tensors and split them up into tensors of length 4, discarding any extra elements (if the length is not divisible by four).

I’ve therefore written the following function:


This produces the output [<tf.Tensor: shape=(4,), dtype=int32, numpy=array([1, 2, 3, 4], dtype=int32)>], as expected.
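For concreteness, a function of this kind might look like the sketch below. The body and the sample input `[1, 2, 3, 4, 5, 6]` are assumptions chosen to match the output quoted above; they are not necessarily the exact code from the question.

```python
import tensorflow as tf

def batches_of_four(tokens):
    # Static shape: a plain Python int when the tensor is concrete (eager).
    token_length = tokens.shape[0]
    splits = token_length // 4
    # Drop the trailing elements that do not fill a group of four.
    tokens = tokens[: splits * 4]
    # With a Python int, tf.split returns a list of `splits` equal pieces.
    return tf.split(tokens, num_or_size_splits=splits)

# Hypothetical input: six elements, so one full group and two discarded.
result = batches_of_four(tf.constant([1, 2, 3, 4, 5, 6]))
print(result)
```

Called eagerly, `tokens.shape[0]` is an ordinary integer, so the integer division and the split both work.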

If I now run the same function using Dataset.map:


I instead get the following error:


I see that this is because token_length is None, but I don’t understand why. I assume this has something to do with graph vs eager execution, but the function works if I call it outside of .map even if I annotate it with @tf.function.

Why is the behavior different inside .map? (Also: is there any better way of writing the batches_of_four function?)


Answer

You should use tf.shape to get the dynamic shape of a tensor in graph mode:
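The difference comes from how the function is traced. `Dataset.map` traces it once against the dataset's element spec, and if that spec has an unknown length, the static `tokens.shape[0]` is `None`. A `@tf.function` called directly on a concrete tensor is traced with that tensor's known shape, which is why it works there. The sketch below illustrates this; the generator-based dataset and the `inspect` helper are assumptions made up for the demonstration.

```python
import tensorflow as tf

def gen():
    # Hypothetical variable-length inputs.
    yield [1, 2, 3, 4, 5, 6]
    yield [1, 2, 3]

ds = tf.data.Dataset.from_generator(
    gen,
    output_signature=tf.TensorSpec(shape=(None,), dtype=tf.int32),
)

static_dims = []

def inspect(tokens):
    # Runs once, at trace time: the static length is unknown here.
    static_dims.append(tokens.shape[0])
    # tf.shape is evaluated per element when the graph actually runs.
    return tf.shape(tokens)[0]

lengths = [int(n) for n in ds.map(inspect)]
print(static_dims, lengths)
```

The static dimension recorded during tracing is `None`, while `tf.shape` yields the real length of each element at run time.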


A second problem is that you pass a scalar tensor as the number of splits, which also does not work in graph mode.

Try this:
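A minimal sketch of one way to do this, assuming a reshape is acceptable in place of a split: take the length from `tf.shape`, trim the tensor to a multiple of four, and reshape it to rows of four. The generator-based dataset and the trailing `unbatch()` call are illustrative assumptions, not necessarily the code from the original answer.

```python
import tensorflow as tf

def batches_of_four(tokens):
    # Dynamic length, available even when the static shape is unknown.
    token_length = tf.shape(tokens)[0]
    usable = (token_length // 4) * 4
    # Reshape avoids needing a Python int for the number of pieces.
    return tf.reshape(tokens[:usable], [-1, 4])

def gen():
    # Hypothetical variable-length inputs.
    yield [1, 2, 3, 4, 5, 6]
    yield [1, 2, 3, 4, 5, 6, 7, 8, 9]

ds = tf.data.Dataset.from_generator(
    gen,
    output_signature=tf.TensorSpec(shape=(None,), dtype=tf.int32),
)

# Each mapped element has shape (k, 4); unbatch() yields the rows one by one.
chunks = ds.map(batches_of_four).unbatch()
rows = [c.numpy().tolist() for c in chunks]
print(rows)
```

Returning a single `(k, 4)` tensor instead of a Python list keeps the function independent of any value that is only known at run time, and `unbatch()` recovers the individual length-4 tensors if those are what you need downstream.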


And see my other answer here.

User contributions licensed under: CC BY-SA