Skip to content
Advertisement

Python code to read a csv file and create a master csv file

I have a folder let’s say as ‘input_Folder’ which has a list of CSV files with data. I’m trying to write a Python code which reads this list of CSV files from the input_folder and creates a master CSV file with two columns.

The columns in the master CSV files are ‘Scenario’ and ‘Status’.

Column Name requirement are as follows, Scenario = Name of the file from the directory and Status = if the file has a value in the second row of second column then populate as ‘Pass’ else ‘Fail’

Below is my code. After executing the code I’m able to see Master CSV created but with empty lines. I’m quite new to python so could somebody help me out here please

import os
import csv

path = (Input file path)

with open("SUMMARY.csv", 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['SCENARIO', 'STATUS'])
    for files in os.walk(path):
        for filename in files:
    with open("input_file.csv") as csv_file: #checking if the code is working for
        # one sample file
        all_rows = list(csv_file)
        line_count = 0
        for row in all_rows[1:2]:
            if line_count == 1:
                if row[1].value == None:
                    writer.writerow([os.path.basename(filename).split(".")[0], 'PASS'])
                else:
                    writer.writerow([os.path.basename(filename).split(".")[0], 'FAIL'])
            line_count += 1

Advertisement

Answer

Your attempt has three problems; it uses os.walk which traverses subdirectories (perhaps this is not a problem because your folder does not have subdirectories, but you should use the correct function for your use case regardless), and you are opening a file in the current directory instead of the one actually returned by os.walk. Finally, the input from csv.reader cannot be None; either the line contains fewer fields (in which case you cannot access the second field at all, and trying will get you an IndexError), or it contains an empty string. (More fundamentally, your indentation seems to be broken, but since you are not asking about a syntax error, I’m guessing your actual code doesn’t have this problem.)

Here’s a quick refactoring to use glob.glob instead of os.walk, assuming that the input CSV files have an empty field where you were looking for None. (It would obviously not be hard to change it to if len(line) < 2: if you wanted to, or cover both conditions.)

import csv
import os
from glob import glob

with open("SUMMARY.csv", 'w', encoding='utf-8') as output_file:
    writer = csv.writer(output_file)
    writer.writerow(['SCENARIO', 'STATUS'])
    for filename in glob(f"{path}/*.csv"):
        with open(filename, 'r', encoding='utf-8') as input_file:
            value = "FAIL"
            reader = csv.reader(input_file)
            for lineno, line in enumerate(reader, 1):
                if lineno != 2:
                    continue
                if line[1] != "":
                    value = "PASS"
                break
        writer.writerow([os.path.basename(filename).split(".")[0], value])

Tangentially perhaps notice also how I avoid having two variables with almost the same names csvfile and csv_file.

The logic writes "FAIL" if there is only one input line, too. (Refactored in response to a comment.)

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement