I have a program that converts CSV files into pipe-delimited files and also counts the total number of lines. If the total is above 7000 lines, I want to split the output across several files: each output file should hold at most 7000 lines, with a new output file created for every further 7000 lines.
Any suggestions, ideas, or modifications will be highly appreciated.
Current code, which converts everything into a single file:
import csv

input_file = input("Enter input file")
output_file = input("Enter Output file")

# count number of lines
def total_lines(input_file):
    with open(input_file) as f:
        return sum(1 for line in f)

# convert input files to output
def file_conversion(input_file, output_file):
    with open(input_file) as fin:
        with open(output_file, 'w', newline='') as fout:
            reader = csv.DictReader(fin, delimiter=',')
            writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
            writer.writeheader()
            writer.writerows(reader)
            print("Successfully converted into", output_file)
Answer
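The more-itertools package makes this easy: its chunked function splits any iterable into lists of a given size (the last list may be shorter).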
import csv
from more_itertools import chunked

def file_conversion(input_file, output_file_pattern, chunksize):
    with open(input_file) as fin:
        reader = csv.DictReader(fin, delimiter=',')
        # Each chunk is a list of up to `chunksize` rows
        for i, chunk in enumerate(chunked(reader, chunksize)):
            # Build this chunk's file name from the pattern, e.g. out000.csv
            output_file = output_file_pattern.format(i)
            with open(output_file, 'w', newline='') as fout:
                writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
                writer.writeheader()
                writer.writerows(chunk)
            print("Successfully converted into", output_file)
Example usage:
file_conversion('in.csv', 'out{:03}.csv', 7000)
which would generate files out000.csv, out001.csv, etc.
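If installing more-itertools is not an option, the same chunking can probably be done with itertools.islice from the standard library. The following is only a sketch under that assumption: the names chunked_rows and file_conversion_stdlib are made up for illustration and are not part of any library, and returning the list of written file names is my own addition.

```python
import csv
from itertools import islice


def chunked_rows(rows, chunksize):
    """Yield successive lists of at most `chunksize` items from `rows`.

    Hypothetical helper standing in for more_itertools.chunked.
    """
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def file_conversion_stdlib(input_file, output_file_pattern, chunksize):
    """Write pipe-delimited files holding at most `chunksize` data rows each.

    Every output file repeats the header row. Returns the file names written.
    """
    written = []
    with open(input_file, newline='') as fin:
        reader = csv.DictReader(fin, delimiter=',')
        for i, chunk in enumerate(chunked_rows(reader, chunksize)):
            output_file = output_file_pattern.format(i)
            with open(output_file, 'w', newline='') as fout:
                writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
                writer.writeheader()
                writer.writerows(chunk)
            written.append(output_file)
    return written
```

It is called the same way as above, e.g. file_conversion_stdlib('in.csv', 'out{:03}.csv', 7000). Note that the 7000 counts data rows only, so each file has up to 7001 lines including its header; pass 6999 if the header should count towards the limit.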