Loading a large dataset from CSV files in TensorFlow

Question

I use the following code to load a bunch of images into my dataset in TensorFlow, which works well: I am wondering how I can use similar code to load a bunch of CSV files. Each CSV file has shape 256 x 256 and can be treated as a grayscale image. I don't know what I should

Accepted Answer

You can achieve this by changing a few things in the load function, as shown below.

    import pandas as pd
    import tensorflow as tf

    def load(image_file):
        # py_function passes the file name as a bytes tensor; decode it to str.
        image_file = bytes.decode(image_file.numpy())
        # read_csv treats the first row as a header by default;
        # pass header=None if the files contain only data rows.
        image = pd.read_csv(image_file)
        image = image.values
        image = tf.convert_to_tensor(image, dtype=tf.float32)
        return image

    # PATH is the directory that holds the CSV files.
    train_dataset = tf.data.Dataset.list_files(PATH + "/*.csv")
    print(train_dataset)
    train_dataset = train_dataset.map(
        lambda x: tf.py_function(load, [x], [tf.float32]),
        num_parallel_calls=tf.data.experimental.AUTOTUNE)

Wrap the load function with tf.py_function in map so that it runs eagerly. This is what lets you call .numpy() on the file name and decode it.

Example output:

    for i in train_dataset.take(1):
        print(i)

    (<tf.Tensor: shape=(256, 256), dtype=float32, numpy=
    array([[255., 255., 255., ..., 255., 255., 255.],
           [255., 255., 255., ..., 255., 255., 255.],
           [255., 255., 255., ..., 255., 255., 255.],
           ...,
           [255., 255., 255., ..., 255., 255., 255.],
           [255., 255., 255., ..., 255., 255., 255.],
           [255., 255., 255., ..., 255., 255., 255.]], dtype=float32)>,)
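One thing to watch for with this approach: tensors returned by tf.py_function carry no static shape, which can cause problems once the dataset is batched or fed to a model. Below is a minimal sketch of one way to handle that. It is not part of the original answer and rests on some assumptions: every file is exactly 256 x 256 with no header row, np.loadtxt is swapped in for pd.read_csv to keep the dependencies small, and tf.data.AUTOTUNE requires TensorFlow 2.4 or later. The names SIDE, load_with_shape, make_dataset and the synthetic demo files are hypothetical and exist only for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

SIDE = 256  # assumed: each CSV holds a SIDE x SIDE grid, as in the question


def load(csv_path):
    # Inside tf.py_function the argument is an eager tensor holding bytes.
    path = csv_path.numpy().decode("utf-8")
    # np.loadtxt reads every row as data (no header row is assumed).
    grid = np.loadtxt(path, delimiter=",", dtype=np.float32)
    return tf.convert_to_tensor(grid, dtype=tf.float32)


def load_with_shape(csv_path):
    # A single dtype (not a list) makes py_function return a single tensor.
    image = tf.py_function(load, [csv_path], tf.float32)
    # py_function drops static shape information, so declare it again.
    image.set_shape([SIDE, SIDE])
    return image


def make_dataset(pattern, batch_size):
    files = tf.data.Dataset.list_files(pattern, shuffle=False)
    images = files.map(load_with_shape, num_parallel_calls=tf.data.AUTOTUNE)
    return images.batch(batch_size).prefetch(tf.data.AUTOTUNE)


# Hypothetical demo data: three synthetic CSV files in a temporary directory.
demo_dir = tempfile.mkdtemp()
for k in range(3):
    np.savetxt(os.path.join(demo_dir, f"img_{k}.csv"),
               np.full((SIDE, SIDE), float(k)), delimiter=",")

dataset = make_dataset(os.path.join(demo_dir, "*.csv"), batch_size=2)
batches = list(dataset)
```

Because Tout is passed as a single dtype here rather than a list, each element is a plain tensor instead of a one-element tuple like the one in the example output above. If a channel axis is needed for a convolutional model, tf.expand_dims(image, -1) inside load_with_shape would be one way to add it.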

Answer