I am trying to delete particular columns in csv file.
CSV file:
Name,Age,YearofService,Department,Allocation Birla,49,12,Welding,Production Robin,38,10,Molding,Production
I am trying to delete Entire column having column header “Department” and “Allocation”.
My code :
with open(input.csv,'r') as i: with open(output.csv,'w',new line='') as o: reader=csv.reader(i) writer = csv.writer(o) for row in reader: for i in range(len(row)): if row[i]!="Department" and row[i]!="Allocation": writer.writerow(row)
My output:
Name Birla Robin Age 49 38 YearofService 12 10
Expected output:
Name,Age,YearofService Birla,49,12 Robin,38,10
We cannot gaurantee Department and Allocation will be in column header position “3” and “4”. thats y am using iteration through length of row
Advertisement
Answer
In this case, the csv.DictReader
and csv.DictWriter
classes are very handy:
import csv with open("input.csv") as instream, open("output.csv", "w") as outstream: # Setup the input reader = csv.DictReader(instream) # Setup the output fields output_fields = reader.fieldnames output_fields.remove("Department") output_fields.remove("Allocation") # Setup the output writer = csv.DictWriter( outstream, fieldnames=output_fields, extrasaction="ignore", # Ignore extra dictionary keys/values ) # Write to the output writer.writeheader() writer.writerows(reader)
Notes
For input, each of the row will be a dictionary such as
{'Name': 'Birla', 'Age': '49', 'YearofService': '12', 'Department': 'Welding', 'Allocation': 'Production'}
For output, we remove those columns (fields) that we don’t need, see
output_fields
The
extraaction
parameter tellsDictReader
to ignore extra keys/values from the dictionaries
Update
In order to remove columns from a CSV file we need to
- Open the input file, reader all the rows, close it
- Open it again to write.
Here is the code, which I modified from the above
import csv with open("input.csv") as instream: # Setup the input reader = csv.DictReader(instream) rows = list(reader) # Setup the output fields output_fields = reader.fieldnames output_fields.remove("Department") output_fields.remove("Allocation") with open("input.csv", "w") as outstream: # Setup the output writer = csv.DictWriter( outstream, fieldnames=output_fields, extrasaction="ignore", # Ignore extra dictionary keys/values ) # Write to the output writer.writeheader() writer.writerows(rows)