I made a simple dataset like the one below:
x_data = [[0, 0],
          [0, 1],
          [1, 0],
          [1, 1]]
y_data = [[0],
          [1],
          [1],
          [0]]
Then I slice it using from_tensor_slices
(I don’t know the exact role of this function):
dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))
When I print the dataset with the print function, it shows:
<BatchDataset shapes: ((None, 2), (None, 1)), types: (tf.int32, tf.int32)>
and when I print it in a for loop, it shows:
tf.Tensor( [[0 0] [0 1] [1 0] [1 1]], shape=(4, 2), dtype=int32) tf.Tensor( [[0] [1] [1] [0]], shape=(4, 1), dtype=int32)
Here is my question:
I expected the tensor shapes to be (4, 2) and (4, 1), because the matrix has 4 rows.
Why does print show (None, 2) and (None, 1) instead?
And how can I print the values of the tensors without a for loop?
Answer
1- What is from_tensor_slices?
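- When you use from_tensor_slices, it creates a TensorFlow dataset from your input tensors. It slices them along the first dimension, so each element of the dataset is one row of x paired with the matching row of y.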
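As a rough illustration of that slicing, the sketch below rebuilds the dataset from the question without batching and inspects a single element. The variable names sliced, x_spec, y_spec and num_elements are only illustrative, and cardinality() is assumed to be available (it was added in TensorFlow 2.3).

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

# Slice both inputs along their first dimension:
# each dataset element is one (x row, y row) pair.
sliced = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# element_spec describes a single element, not the whole dataset.
x_spec, y_spec = sliced.element_spec
print(x_spec.shape, y_spec.shape)

# Number of elements equals the number of rows that were passed in.
num_elements = int(sliced.cardinality())
print(num_elements)
```

Here each element has shape (2,) for x and (1,) for y, and there are 4 elements, one per row of the original lists.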
2- What are the benefits of using a TensorFlow dataset?
- It makes the usual dataset operations easy: you can shuffle, batch, preprocess the data with map, and feed the dataset directly to your model with model.fit(dataset).
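A minimal sketch of such a pipeline, using the data from the question, might look like this. The to_float function is a hypothetical preprocessing step chosen only for illustration, and the batch size of 2 is arbitrary. No model is defined here, so the model.fit call is left out.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

def to_float(x, y):
    # Hypothetical preprocessing step: cast features and labels to float32.
    return tf.cast(x, tf.float32), tf.cast(y, tf.float32)

pipeline = (
    tf.data.Dataset.from_tensor_slices((x_data, y_data))
    .shuffle(buffer_size=4)  # shuffle the 4 samples
    .map(to_float)           # preprocess each (x, y) pair
    .batch(2)                # group samples into batches of 2
)

# Collect the shape of every batch the pipeline produces.
batch_shapes = [(x.shape.as_list(), y.shape.as_list()) for x, y in pipeline]
print(batch_shapes)
```

With 4 samples and a batch size of 2, the pipeline yields two batches, each with x of shape [2, 2] and y of shape [2, 1]. The row order varies between runs because of the shuffle.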
3- Why does print show BatchDataset and not the values?
- The dataset variable is an object of the BatchDataset class (since you defined it as dataset = from_tensor_slices((x, y)).batch(bs)). It is not a Python list, an eager tensor or a NumPy array, so print shows a description of the object rather than its values.
4- What can I do to see the values stored in a tf dataset?
- You can access its values with the take() method of this class:
one_batch = dataset.take(1)  # takes 1 batch of data from the dataset
# each batch is a tuple (like what you passed to from_tensor_slices)
# you passed x and y, so it returns a batch of x and y
for x, y in one_batch:
    print(x.shape)
    print(y.shape)
# (4, 2)  (batch_size, num_features)
# (4, 1)  (batch_size, labels_dim)
5- What are (None, 2) and (None, 1) in the BatchDataset object?
- They are the shapes of the batches: x = (None, 2) and y = (None, 1). None in a shape means that the dimension is not known in advance. Here the first dimension (the number of samples in a batch) can be anything, because the last batch may hold fewer samples than the batch size, while the second dimension of x is always 2. The same rule applies to y.
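To make the None concrete, here is a small sketch that compares the default batch() with batch(..., drop_remainder=True). The batch size of 3 is chosen only so that the 4 samples do not divide evenly. The names base, dynamic and static are illustrative.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]
base = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Default: the last batch may be smaller, so the batch dimension is unknown.
dynamic = base.batch(3)
# drop_remainder=True: every batch has exactly 3 rows, so the shape is static.
static = base.batch(3, drop_remainder=True)

dynamic_shape = dynamic.element_spec[0].shape.as_list()
static_shape = static.element_spec[0].shape.as_list()
print(dynamic_shape)
print(static_shape)

# The batches actually produced by the default version differ in size.
actual = [x.shape.as_list() for x, _ in dynamic]
print(actual)
```

The default dataset reports [None, 2] and produces batches of 3 and 1 rows, while the drop_remainder version reports [3, 2] and discards the leftover row. So if a fixed shape such as (4, 2) is wanted in the printed description, batch(len(x_data), drop_remainder=True) is one way to get it.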
6- How to print the values without a for loop?
- For performance reasons a dataset behaves like a generator: it does not hold all of its values at once, so print cannot show them directly. You access its elements one by one (batch by batch) by iterating over it.
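If the goal is only to avoid writing a for loop, one option is to pull a batch out with Python's built-in iter() and next(), or to collect the batches with as_numpy_iterator(). This is a sketch that assumes eager execution (the TensorFlow 2 default). The elements are still produced batch by batch underneath.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]
dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))

# Pull the first (and here the only) batch out of the dataset.
x_batch, y_batch = next(iter(dataset))
print(x_batch.numpy())
print(y_batch.numpy())

# Or materialize every batch as a tuple of NumPy arrays.
all_batches = list(dataset.as_numpy_iterator())
print(len(all_batches))
```

Because the batch size equals the number of samples, the single batch holds the whole dataset. For a large dataset, collecting everything with list() would load it all into memory, which is what the batch-by-batch design is meant to avoid.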