I use tf.data.TextLineDataset to read 4 large files, and I use tf.data.Dataset.zip to zip these 4 files and create "dataset". However, I cannot pass "dataset" to dataset.map to use tf.compat.v1.string_split and split with the \t separator, and then batch, prefetch, and finally feed it into my model.
This is my code:
d1 = tf.data.TextLineDataset("File1.raw")
d2 = tf.data.TextLineDataset("File2.raw")
d3 = tf.data.TextLineDataset("File3.raw")
d4 = tf.data.TextLineDataset("File4.raw")
dataset = tf.data.Dataset.zip((d1, d2, d3, d4))
dataset = dataset.map(lambda string: tf.compat.v1.string_split([string], sep='\t').values)
This is the error message:
packages/tensorflow/python/autograph/impl/api.py", line 339, in _call_unconverted
    return f(*args, **kwargs)
TypeError: <lambda>() takes 1 positional argument but 4 were given
What should I do?
Answer
The tf.data.Dataset.zip function iterates over an arbitrary number of dataset objects at the same time. In other words, if you zip four datasets, you will get four items at each iteration (one from each dataset). This also explains the error the OP received:
TypeError: <lambda>() takes 1 positional argument but 4 were given
The function being mapped needs to be able to handle four arguments, because it is being applied to a zip of four datasets. The code below includes a function that takes four arguments (one per dataset) and splits each of them by \t. You can map this over the zipped dataset. I substituted the tf.data.TextLineDataset objects with sample datasets.
import tensorflow as tf

# Sample datasets standing in for the TextLineDatasets;
# each element is a tab-separated string.
d1 = tf.data.Dataset.from_tensors(["foo\t1"])
d2 = tf.data.Dataset.from_tensors(["foo\t2"])
d3 = tf.data.Dataset.from_tensors(["foo\t3"])
d4 = tf.data.Dataset.from_tensors(["foo\t4"])

def split_by_tab(text1, text2, text3, text4):
    # One argument per zipped dataset; split each line on the tab character.
    sep = "\t"
    return (
        tf.strings.split(text1, sep=sep),
        tf.strings.split(text2, sep=sep),
        tf.strings.split(text3, sep=sep),
        tf.strings.split(text4, sep=sep),
    )

dataset = tf.data.Dataset.zip((d1, d2, d3, d4))
dataset = dataset.map(split_by_tab)
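Since the question also asks about batching and prefetching, a minimal continuation might look like this (the batch size is an arbitrary example value):

# tf.strings.split yields RaggedTensors; Dataset.batch supports ragged
# components in TF 2.3+. On older TF 2.x, use tf.data.experimental.AUTOTUNE.
dataset = dataset.batch(32).prefetch(tf.data.AUTOTUNE)

for batch in dataset.take(1):
    print(batch)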
As an alternative, I could merge these files and create one very large file, and then shuffle, batch and prefetch rows from it. Right? Any other solution?
The files could be merged, but if they are large, it’s probably not worth doing. I did not realize that the features were split across multiple files. In this case, zipping is a reasonable thing to do.
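For completeness, if each row were a self-contained example rather than one quarter of a feature set, a single pipeline over all four files could look roughly like this sketch (buffer and batch sizes are placeholder choices):

# tf.data.TextLineDataset accepts a list of files and reads them in sequence,
# so no merged file is needed in that scenario.
files = ["File1.raw", "File2.raw", "File3.raw", "File4.raw"]
merged = tf.data.TextLineDataset(files)
merged = merged.map(lambda line: tf.strings.split(line, sep="\t"))
merged = merged.shuffle(buffer_size=10_000).batch(32).prefetch(tf.data.AUTOTUNE)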
There is also a library, tensorflow_text, that may be relevant to this question. It might be worth checking out.
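As a rough illustration of what that library offers, here is a small sketch using its whitespace tokenizer. Note this is not a drop-in replacement for tab splitting, since it splits on any whitespace:

import tensorflow_text as tf_text  # pip install tensorflow-text

# WhitespaceTokenizer splits on spaces, tabs and newlines, so it only
# matches string_split(sep='\t') when the fields themselves contain no spaces.
tokenizer = tf_text.WhitespaceTokenizer()
tokens = tokenizer.tokenize(["foo\tbar\tbaz"])
print(tokens)  # <tf.RaggedTensor [[b'foo', b'bar', b'baz']]>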