I have a folder let’s say as ‘input_Folder’ which has a list of CSV files with data. I’m trying to write a Python code which reads this list of CSV files from the input_folder and creates a master CSV file with two columns.
The columns in the master CSV files are ‘Scenario’ and ‘Status’.
Column Name requirement are as follows, Scenario = Name of the file from the directory and Status = if the file has a value in the second row of second column then populate as ‘Pass’ else ‘Fail’
Below is my code. After executing the code I’m able to see Master CSV created but with empty lines. I’m quite new to python so could somebody help me out here please
import os import csv path = (Input file path) with open("SUMMARY.csv", 'w') as csvfile: writer = csv.writer(csvfile) writer.writerow(['SCENARIO', 'STATUS']) for files in os.walk(path): for filename in files: with open("input_file.csv") as csv_file: #checking if the code is working for # one sample file all_rows = list(csv_file) line_count = 0 for row in all_rows[1:2]: if line_count == 1: if row[1].value == None: writer.writerow([os.path.basename(filename).split(".")[0], 'PASS']) else: writer.writerow([os.path.basename(filename).split(".")[0], 'FAIL']) line_count += 1
Advertisement
Answer
Your attempt has three problems; it uses os.walk
which traverses subdirectories (perhaps this is not a problem because your folder does not have subdirectories, but you should use the correct function for your use case regardless), and you are opening a file in the current directory instead of the one actually returned by os.walk
. Finally, the input from csv.reader
cannot be None
; either the line contains fewer fields (in which case you cannot access the second field at all, and trying will get you an IndexError
), or it contains an empty string. (More fundamentally, your indentation seems to be broken, but since you are not asking about a syntax error, I’m guessing your actual code doesn’t have this problem.)
Here’s a quick refactoring to use glob.glob
instead of os.walk
, assuming that the input CSV files have an empty field where you were looking for None
. (It would obviously not be hard to change it to if len(line) < 2:
if you wanted to, or cover both conditions.)
import csv import os from glob import glob with open("SUMMARY.csv", 'w', encoding='utf-8') as output_file: writer = csv.writer(output_file) writer.writerow(['SCENARIO', 'STATUS']) for filename in glob(f"{path}/*.csv"): with open(filename, 'r', encoding='utf-8') as input_file: value = "FAIL" reader = csv.reader(input_file) for lineno, line in enumerate(reader, 1): if lineno != 2: continue if line[1] != "": value = "PASS" break writer.writerow([os.path.basename(filename).split(".")[0], value])
Tangentially perhaps notice also how I avoid having two variables with almost the same names csvfile
and csv_file
.
The logic writes "FAIL"
if there is only one input line, too. (Refactored in response to a comment.)