Sort filepaths according to their respective file extensions

I am trying to sort filepaths according to their respective file extensions.

I would like to have an output like this:

FileType FilePath
.h a/b/c/d/xyz.h
.h a/b/c/d/xyz1.h
.class a/b/c/d/xyz.class
.class a/b/c/d/xyz1.class
.jar a/b/c/d/xyz.jar
.jar a/b/c/d/xyz1.jar

But the output I get now, in Excel, has one column per extension instead.

Below is my code:

import pandas as pd
import glob

path = "The path goes here"

yes = [glob.glob(path+e,recursive = True) for e in ["/**/*.h","/**/*.class","/**/*.jar"]]

print(type(yes))  #File type is list
    
df = pd.DataFrame(yes)
df = df.transpose()
df.columns = [".h", ".class",".jar"]
print (df)

with pd.ExcelWriter('test.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='filepath', index=False)

Could anyone please help me with this? Thanks in advance!

Answer

Please try this code:

import os
import pathlib
import pandas as pd

path = 'C:/'

full_file_paths = []
file_suffix = []
for root, dirs, files in os.walk(path):
    for f in files:
        # Record the extension (e.g. ".h") and the full path of each file
        file_suffix.append(pathlib.PurePath(f).suffix)
        full_file_paths.append(os.path.join(root, f))

# Group the collected paths by extension
file_suffix = set(file_suffix)
processed_files = dict()
for fs in file_suffix:
    processed_files[fs] = []
    for f in full_file_paths:
        # Compare the real suffix so that ".h" does not also match ".html"
        if pathlib.PurePath(f).suffix == fs:
            processed_files[fs].append(f)
    print('--------------------------------')
    print(fs)
    print(processed_files[fs])
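
If you want the two-column FileType / FilePath layout from the question rather than one list per extension, a minimal sketch could look like the following. The function name rows_by_extension and its parameters are made up for illustration, and it assumes you only care about the extensions you pass in, listed in the order you want them to appear.

```python
import os

def rows_by_extension(root_dir, extensions):
    """Return (extension, path) pairs grouped in the order given by `extensions`.

    Hypothetical helper for illustration; not part of pandas or the standard library.
    """
    # Map each wanted extension to its position so rows can be sorted by it
    order = {ext: i for i, ext in enumerate(extensions)}
    rows = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            ext = os.path.splitext(name)[1]
            if ext in order:
                rows.append((ext, os.path.join(root, name)))
    # Sort by extension order first, then alphabetically by path
    rows.sort(key=lambda row: (order[row[0]], row[1]))
    return rows
```

The rows can then be passed to pandas, for example pd.DataFrame(rows, columns=["FileType", "FilePath"]).to_excel("test.xlsx", index=False), which should give one row per file instead of one column per extension.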
User contributions licensed under: CC BY-SA