Why is python split-folders messing up data when splitting into train and test?

I want to split a folder of images into train and test sets using python split-folders, and I set the second ratio value, which represents validation, to zero. The problem is that the original folder contains 155 images, but the split gave 57 in train and 56 in test, so I don't know where the remaining 42 images went. This is what I tried:

!pip install split-folders

import splitfolders 

splitfolders.ratio(
    "/content/drive/MyDrive/DatasetTuberculosis /images", 
    output="/content/drive/MyDrive/DatasetTuberculosis /Dataset", 
    seed=1337, 
    ratio=(.8, .0, .2), 
    group_prefix=None
)

Train_path = "/content/drive/MyDrive/DatasetTuberculosis /Dataset/train"
Test_path = "/content/drive/MyDrive/DatasetTuberculosis /Dataset/test"


print(len(Train_path))
print(len(Test_path))

Output :

57
56

Answer

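You are not printing the number of files in those folders. You are printing the length of the path strings themselves, i.e. the number of characters in "/content/drive/MyDrive/DatasetTuberculosis /Dataset/train" and "/content/drive/MyDrive/DatasetTuberculosis /Dataset/test".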
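A quick check makes this visible. The sketch below reuses the two paths from the question and does not touch the file system; the lowercase variable names are only for illustration.

```python
# The two output paths from the question (note the space before "/Dataset").
train_path = "/content/drive/MyDrive/DatasetTuberculosis /Dataset/train"
test_path = "/content/drive/MyDrive/DatasetTuberculosis /Dataset/test"

# len() on a str counts characters; it never looks at the directory the string names.
train_len = len(train_path)
test_len = len(test_path)

print(train_len)  # 57
print(test_len)   # 56
```

These are exactly the two numbers in the output, so nothing has been lost by the split; the images simply were never counted.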

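You can use glob to list the files inside a folder and count them. The pattern needs a wildcard, because glob.glob(Train_path) on its own matches only the folder itself. Assuming split-folders wrote the images into one subfolder per class under train and test, something like this should work:

import glob
print(len(glob.glob(Train_path + "/*/*")))
print(len(glob.glob(Test_path + "/*/*")))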
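If you would rather not depend on how deep the images are nested, a recursive count is another option. The following is a minimal sketch, not part of split-folders: count_files is a hypothetical helper, and the demo builds a throwaway directory that mimics an assumed train/<class>/<image> layout so it can run anywhere.

```python
import tempfile
from pathlib import Path


def count_files(folder):
    """Count regular files anywhere under folder (subfolders themselves are skipped)."""
    return sum(1 for p in Path(folder).rglob("*") if p.is_file())


# Demo on a temporary directory with an assumed layout: train/<class>/<image>.
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_images in (("class_a", 3), ("class_b", 2)):
        class_dir = Path(tmp) / "train" / class_name
        class_dir.mkdir(parents=True)
        for i in range(n_images):
            (class_dir / f"img_{i}.png").touch()  # empty placeholder files

    demo_count = count_files(Path(tmp) / "train")

print(demo_count)  # 5
```

On your data you would call count_files(Train_path) and count_files(Test_path) instead of the demo directory.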