
How to create multiple delimited files in Python?

I have a program that converts CSV files into pipe-delimited files and also counts the total number of lines. If the input has more than 7000 lines, I want to split the output across several files: each output file should hold at most 7000 lines, with a new file created for every further 7000 lines.

Any suggestions, ideas, or modifications will be highly appreciated.

Previous code, which converts the input into a single file:

import csv
input_file = input("Enter input file")
output_file = input("Enter Output file")

# count number of lines
def total_lines(input_file):
    with open(input_file) as f:
        return sum(1 for line in f)

# convert input files to output
def file_conversion(input_file, output_file):
    with open(input_file) as fin:
        with open(output_file, 'w', newline='') as fout:
            reader = csv.DictReader(fin, delimiter=',')
            writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
            writer.writeheader()
            writer.writerows(reader)
            print("Successfully converted into", output_file)


Answer

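The third-party more-itertools package makes this easy: its chunked function yields the reader's rows in lists of at most chunksize items, and each list is written to its own file.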

import csv
from more_itertools import chunked

def file_conversion(input_file, output_file_pattern, chunksize):
    with open(input_file) as fin:
        reader = csv.DictReader(fin, delimiter=',')
        # chunked() yields lists of up to `chunksize` rows; `i` numbers the output files
        for i, chunk in enumerate(chunked(reader, chunksize)):
            output_file = output_file_pattern.format(i)
            with open(output_file, 'w', newline='') as fout:
                writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
                writer.writeheader()
                writer.writerows(chunk)
                print("Successfully converted into", output_file)

Example usage:

file_conversion('in.csv', 'out{:03}.csv', 7000)

which would generate files out000.csv, out001.csv, etc., each containing the header row and up to 7000 data rows.
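
If installing more-itertools is not an option, the same chunking can probably be done with itertools.islice from the standard library. The following is only a sketch under that assumption: the name file_conversion_stdlib and the returned list of written file names are hypothetical additions for illustration, not part of the answer above.

```python
import csv
from itertools import islice


def file_conversion_stdlib(input_file, output_file_pattern, chunksize):
    """Split a comma-delimited file into pipe-delimited files of at most
    `chunksize` data rows each (hypothetical stdlib-only variant)."""
    written = []
    with open(input_file, newline='') as fin:
        reader = csv.DictReader(fin, delimiter=',')
        fieldnames = reader.fieldnames
        i = 0
        while True:
            # islice pulls the next `chunksize` rows without reading the whole file
            chunk = list(islice(reader, chunksize))
            if not chunk:
                break  # input exhausted
            output_file = output_file_pattern.format(i)
            with open(output_file, 'w', newline='') as fout:
                writer = csv.DictWriter(fout, fieldnames, delimiter='|')
                writer.writeheader()
                writer.writerows(chunk)
            written.append(output_file)
            i += 1
    return written
```

It is called the same way, e.g. file_conversion_stdlib('in.csv', 'out{:03}.csv', 7000). An input with no data rows produces no output files in this sketch.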

User contributions licensed under: CC BY-SA