I made a simple dataset like the one below:
x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]
Then I slice it using from_tensor_slices (I don’t know the exact role of this function):
dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))
When I print the dataset with the print function, it shows:
<BatchDataset shapes: ((None, 2), (None, 1)), types: (tf.int32, tf.int32)>
and when I print it in a for loop, it shows:
tf.Tensor(
[[0 0]
 [0 1]
 [1 0]
 [1 1]], shape=(4, 2), dtype=int32)
tf.Tensor(
[[0]
 [1]
 [1]
 [0]], shape=(4, 1), dtype=int32)
Here is my question:
I expected the tensor shapes to be (4, 2) and (4, 1), because each matrix has 4 rows. Why does print show (None, 2) and (None, 1) instead?
And how can I print the values of the tensors without a for loop?
Answer
1- What is from_tensor_slices?
- from_tensor_slices creates a TensorFlow dataset from your input tensors by slicing them along their first dimension, so each element of the dataset is one (x, y) pair.
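As a rough illustration of that slicing, the sketch below builds the dataset without calling batch and lists its elements. The names unbatched and pairs are only illustrative and do not appear in the question; TensorFlow 2.x with eager execution is assumed.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

# from_tensor_slices splits the inputs along their first axis,
# so each dataset element is a single (x, y) pair.
unbatched = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Collect the elements as plain Python lists (illustrative helper variable).
pairs = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in unbatched]
for x, y in pairs:
    print(x, y)
```

Each of the four rows of x_data ends up paired with the matching row of y_data, which is why the unbatched elements have shapes (2,) and (1,).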
2- What are the benefits of using a TensorFlow dataset?
- It makes the common dataset operations easy: you can shuffle, batch, and preprocess the data with map, and feed the result directly to your model with model.fit(dataset).
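A minimal sketch of such a pipeline, assuming TensorFlow 2.x, might look like this. The names pipeline and batch_shapes, the batch size of 2, and the cast to float32 are choices made for the example, not part of the question.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

pipeline = (
    tf.data.Dataset.from_tensor_slices((x_data, y_data))
    # Randomize the order of the four samples.
    .shuffle(buffer_size=len(x_data))
    # Example preprocessing step: cast both tensors to float32.
    .map(lambda x, y: (tf.cast(x, tf.float32), tf.cast(y, tf.float32)))
    # Group the samples into batches of 2.
    .batch(2)
)

# Shapes of the batches that actually come out of the pipeline.
batch_shapes = [(x.shape.as_list(), y.shape.as_list()) for x, y in pipeline]
print(batch_shapes)
```

The resulting object can be passed to model.fit in place of separate x and y arrays.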
3- Why does the print function show BatchDataset and not the values?
- The dataset variable is an object of the BatchDataset class (since you defined it as dataset = from_tensor_slices((x, y)).batch(bs)). It is not a Python list, eager tensor, or NumPy array, so print does not show its values.
4- What can I do to see the values stored in a tf dataset?
- You can access its values with the take() method of this class:
one_batch = dataset.take(1)  # takes 1 batch of data from the dataset
# each batch is a tuple (like what you passed to from_tensor_slices)
# you passed x and y, so it returns a batch of x and a batch of y
for x, y in one_batch:
    print(x.shape)  # (4, 2) = (batch_size, num_features)
    print(y.shape)  # (4, 1) = (batch_size, labels_dim)
5- What are (None, 2) and (None, 1) in the BatchDataset object?
- They are the shapes of x = (None, 2) and y = (None, 1). None in the first dimension means that the number of samples per batch is not fixed in advance and can be anything (for example, the last batch can be smaller than the others), while the second dimension is always 2. The same rule applies to y.
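To see where None comes from, the sketch below compares the declared (static) shape with the sizes observed at run time. It assumes TensorFlow 2.x; the batch size of 3 and the names base, dynamic, static, and runtime_sizes are picked only for illustration.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]
base = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Default batching: the last batch may be smaller,
# so the batch dimension is reported as None.
dynamic = base.batch(3)

# drop_remainder=True discards a short final batch,
# so the batch dimension is known in advance.
static = base.batch(3, drop_remainder=True)

dynamic_shape = dynamic.element_spec[0].shape.as_list()
static_shape = static.element_spec[0].shape.as_list()

# Actual batch sizes seen when iterating the default-batched dataset.
runtime_sizes = [int(x.shape[0]) for x, _ in dynamic]
print(dynamic_shape, static_shape, runtime_sizes)
```

With 4 samples and a batch size of 3, the default dataset yields one batch of 3 and one of 1, which is exactly the case None leaves room for. Each batch you pull out still has a concrete shape, such as the (4, 2) seen in the for loop.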
6- How to print values without a for loop?
- For performance reasons, a dataset behaves like a generator. You cannot print all of its values at once with print(dataset); you access its elements one by one (batch by batch).
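If the goal is only to avoid writing an explicit for loop, one possible approach is to pull a batch with next(iter(...)), or to materialize a small dataset with as_numpy_iterator(). This is a sketch assuming TensorFlow 2.1 or later; it still iterates under the hood, and the names x_batch, y_batch, and all_batches are illustrative.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]
dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))

# Pull the first (here: the only) batch without a for loop.
x_batch, y_batch = next(iter(dataset))
print(x_batch.numpy())
print(y_batch.numpy())

# For a small dataset, collect every batch as NumPy arrays in a list.
all_batches = list(dataset.as_numpy_iterator())
print(len(all_batches))
```

Materializing everything into a list is only reasonable for datasets that fit comfortably in memory.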