I have been trying to write a for loop that combines all of the CSV files in a directory into one. The files in that directory were produced by another pandas program I wrote, where I used group.to_csv(f'data3/{station}.csv', index=False, encoding="utf-8") to make sure the encoding is utf-8.
The combining code I used is as follows:
import os
import pandas as pd

master_df = pd.DataFrame()
directory_path = '/pandas learning/data3'
for file in os.listdir(directory_path):
    pd.read_csv(f'data3/{file}')
    if file.endswith('.csv'):
        master_df = master_df.append(pd.read_csv(f'data3/{file}'))
master_df.to_csv("final.csv")
When I run the program, it gives me a UnicodeDecodeError, and since this code is for about 100 files, I can't go and change the encoding of each one by hand.
Answer
Since pandas.DataFrame.append is deprecated (and was removed in pandas 2.0), use pandas.concat instead:
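import os
import pandas as pd

directory_path = '/pandas learning/data3'
data = []
for file in os.listdir(directory_path):
    # Check the extension before reading, so non-CSV files are skipped
    if file.endswith('.csv'):
        # Replace 'xxxx' with the actual encoding of your files
        temp = pd.read_csv(os.path.join(directory_path, file), encoding='xxxx')
        data.append(temp)

# Concatenate all frames at once instead of appending inside the loop
master_df = pd.concat(data)
master_df.to_csv("final.csv")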
Note: Make sure to use the encoding that corresponds to your .csv files (e.g. utf-8, ansi, utf-8-sig, ...).
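If you are not sure which encoding your files use, one option is to test a few candidates on the raw bytes before handing the file to pandas. The sketch below is only an illustration: the helper name find_encoding and the candidate list are assumptions, not part of pandas, and a successful decode does not prove the encoding is the right one (cp1252 in particular accepts almost any byte sequence).

```python
from pathlib import Path


def find_encoding(path, candidates=("utf-8-sig", "cp1252")):
    """Return the first candidate that decodes the whole file, or None.

    Hypothetical helper: the candidate list is an assumption and should be
    adjusted to the encodings you actually expect.
    """
    raw = Path(path).read_bytes()
    for encoding in candidates:
        try:
            # Decoding the full content raises if any byte is invalid
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None
```

Inside the loop you could then call pd.read_csv(path, encoding=find_encoding(path)), and print the names of the files for which the helper returns None so you can inspect them separately.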