Hello, I am trying to find the number of rows for all files within a folder. I am trying to do this for a folder that contains only “.txt” files and for a folder that contains “.csv” files.
I know that the way to get the number of rows for a SINGLE “.txt” file is something like this:
file = open("sample.txt","r") Counter = 0 Content = file.read() CoList = Content.split("n") for i in CoList: if i: Counter += 1 print("This is the number of lines in the file") print(Counter)
Whereas for a SINGLE “.csv” file it is something like this:
file = open("sample.csv") reader = csv.reader(file) lines= len(list(reader)) print(lines)
But how can I do this for ALL files within a folder? That is, how can I loop each of these procedures across all files within a folder and, ideally, export the output into an Excel sheet with columns akin to these:
Filename    Number of Rows
1.txt       900
2.txt       653
and so on and so on.
Thank you so much for your help.
Answer
You can use glob to detect the files and then just iterate over them.
Other methods: How do I list all files of a directory?
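For instance, one such alternative is pathlib; the sketch below is an assumption on my part (not part of the original answer) and lists the same files as the glob call that follows:

from pathlib import Path

# Alternative to glob.glob: list the ".txt" files in the current directory
# (switch the pattern to "*.csv" for the other folder)
txt_filepaths = [str(p) for p in Path(".").glob("*.txt")]
print(txt_filepaths)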
import glob

# 1. list all text files in the directory
rel_filepaths = glob.glob("*.txt")

# 2. (optional) create a function to read the number of rows in a file
def count_rows(filepath):
    res = 0
    f = open(filepath, 'r')
    res = len(f.readlines())
    f.close()
    return res

# 3. iterate over your files and use the count_rows function
counts = [count_rows(filepath) for filepath in rel_filepaths]
print(counts)
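The question also mentions a folder of .csv files; the same loop works there. The sketch below is my adaptation (the count_csv_rows helper is illustrative, not part of the answer above), assuming every file matching the pattern is a readable CSV:

import csv
import glob

# Count the rows of a single .csv file with csv.reader
def count_csv_rows(filepath):
    with open(filepath, newline="") as f:
        return len(list(csv.reader(f)))

# Apply it to every .csv file in the folder
csv_filepaths = glob.glob("*.csv")
csv_counts = [count_csv_rows(filepath) for filepath in csv_filepaths]
print(csv_counts)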
Then, if you want to export this result to a .csv or .xlsx file, I recommend using pandas.
import pandas as pd

# 1. create a new table and add your two columns filled with the previous values
df = pd.DataFrame()
df["Filename"] = rel_filepaths
df["Number of rows"] = counts

# 2. export this dataframe to .csv
df.to_csv("results.csv")
You can also use pandas.ExcelWriter() if you want to use the .xlsx format. Link to documentation & examples: Pandas – ExcelWriter doc
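For example, a minimal sketch that reuses the df built above, assuming an Excel engine such as openpyxl is installed (the file and sheet names are placeholders):

import pandas as pd

# Write the same dataframe to an .xlsx workbook instead of a .csv file
with pd.ExcelWriter("results.xlsx") as writer:
    df.to_excel(writer, sheet_name="Row counts", index=False)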