
Why is the tensor shape different when I use tf.print?

I made a simple dataset like the one below.

x_data = [[0, 0],
          [0, 1],
          [1, 0],
          [1, 1]]
y_data = [[0],
          [1],
          [1],
          [0]]

Then I sliced it using from_tensor_slices (I don't know the exact role of the tensor slice function…):

dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))

When I print the dataset using the print function, it shows the following:

<BatchDataset shapes: ((None, 2), (None, 1)), types: (tf.int32, tf.int32)>

And when I print it using a for loop, it shows the following:

tf.Tensor(
[[0 0]
 [0 1]
 [1 0]
 [1 1]], shape=(4, 2), dtype=int32) 
tf.Tensor(
[[0]
 [1]
 [1]
 [0]], shape=(4, 1), dtype=int32)

Here is my question:

I expected the tensor shapes to be (4, 2) and (4, 1), because each matrix has 4 rows.

Why does print show (None, 2) and (None, 1) instead?

And how can I print the values of the tensors without a for loop?


Answer

1- What is from_tensor_slices?

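  • When you use from_tensor_slices, it creates a TensorFlow dataset by slicing your input tensors along their first dimension, so each row of x_data is paired with the matching row of y_data as one dataset element.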
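
To make the slicing concrete, here is a minimal sketch that reuses the x_data and y_data from the question and inspects the dataset before any batching. It assumes TensorFlow 2.x with eager execution; the variable name unbatched is only illustrative.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

# No .batch() here: each element is one (x row, y row) pair.
unbatched = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# The element spec describes a single element, so the leading
# dimension of 4 is gone: it has been sliced away.
x_spec, y_spec = unbatched.element_spec
print(x_spec.shape, y_spec.shape)

# Pull out the first element and count how many elements there are.
first_x, first_y = next(iter(unbatched))
num_elements = sum(1 for _ in unbatched)
print(first_x.numpy(), first_y.numpy(), num_elements)
```

Calling .batch(4) afterwards stacks these four elements back together, which is why the batched tensors in the question have shapes (4, 2) and (4, 1).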

2- What are the benefits of using a TensorFlow dataset?

  • It makes the common dataset operations easy: you can shuffle, batch, and preprocess the data with map, and you can feed the dataset straight to your model, e.g. model.fit(dataset).

3- Why does the print function show BatchDataset and not the values?

  • The dataset variable is an instance of the BatchDataset class (because you defined it as dataset = from_tensor_slices((x, y)).batch(bs)). It is not a Python list, eager tensor, or NumPy array, so print shows a description of the object (element shapes and types) rather than its values.

4- What can I do to see the values stored in a tf dataset?

  • You can access its values by using the take() method of this class:
one_batch = dataset.take(1)  # takes 1 batch of data from the dataset

# Each batch is a tuple with the same structure you passed to from_tensor_slices.
# You passed x and y, so each batch is a pair (x, y).
for x, y in one_batch:
    print(x.shape)
    print(y.shape)
# (4, 2)  -> (batch_size, num_features)
# (4, 1)  -> (batch_size, labels_dim)

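5- What are (None, 2) and (None, 1) in the BatchDataset object?

  • They are the static shapes of x and y as declared by the dataset. None means that the first dimension (the number of samples in a batch) is not fixed ahead of time: batch() may produce a smaller final batch when the number of samples is not a multiple of the batch size, so the dataset only guarantees the second dimension (2 for x, 1 for y). The tensors you get when iterating are concrete, which is why the for loop reports (4, 2) and (4, 1).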
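
The difference between the declared shape and the concrete shape can be checked directly. The sketch below assumes TensorFlow 2.x and compares the default batch() with batch(..., drop_remainder=True), which promises full batches only and therefore lets TensorFlow record a fixed batch dimension. The names dynamic and static are only illustrative.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

base = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Default batching: the last batch may be smaller than batch_size,
# so the batch dimension is reported as None.
dynamic = base.batch(len(x_data))

# drop_remainder=True discards any incomplete final batch, so every
# batch has exactly batch_size rows and the dimension is known.
static = base.batch(len(x_data), drop_remainder=True)

dynamic_x_spec, dynamic_y_spec = dynamic.element_spec
static_x_spec, static_y_spec = static.element_spec

print(dynamic_x_spec.shape, dynamic_y_spec.shape)
print(static_x_spec.shape, static_y_spec.shape)
```

Both datasets yield the same values here; only the shape information known before iteration differs.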

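6- How do I print the values without a for loop?

  • For performance reasons, a dataset behaves like a generator: its elements are produced lazily, one batch at a time, so the dataset object does not hold all the values for print to show. You have to pull elements out of it, batch by batch. If you only want to avoid writing a for loop, next(iter(dataset)) returns the first batch.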
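
As a rough sketch of pulling values out without an explicit for loop, assuming TensorFlow 2.x with eager execution (as_numpy_iterator needs a reasonably recent 2.x release), the dataset from the question can be read like this:

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))

# Take the first batch. Here it is also the only batch, because the
# batch size equals the number of samples.
x_batch, y_batch = next(iter(dataset))
print(x_batch.numpy())
print(y_batch.numpy())

# Alternatively, materialise every batch as NumPy arrays. This loads
# the whole dataset into memory, so it only suits small datasets.
all_batches = list(dataset.as_numpy_iterator())
print(len(all_batches))
```

The iteration still happens inside iter() and list(); it is just no longer written out as a loop.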
User contributions licensed under: CC BY-SA