I want to parse a CSV file, but the data looks like the sample below. When I use ,” as the separator, the values are not distributed correctly across the columns. Is there a way to ignore the ” characters, or to escape them with a regex?
3,”Gunnar Nielsen Aaby”,”M”,24,NA,NA,”Denmark”,”DEN” 4,”Edgar Lindenau Aabye”,”M”,34,NA,NA,”Denmark/Sweden” 5,”Christine Jacoba Aaftink”,”F”,21,185,82,”Netherlands” 5,”Christine Jacoba Aaftink”,”F”,21,185,82,”Netherlands” 6,”Per Knut Aaland”,”M”,31,188,75,”United States”,”USA”
Thanks in advance.
Answer
Reading the CSV file (assuming there are no newlines between the rows):
with open('data') as f:
    raw = f.readline()
Some splitting and processing:
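data = []
for r in raw.strip().split('" '):
    # split('" ') consumes each row's closing quote; restore it where it is missing
    if not r.endswith('"'):
        r += '"'
    data.append(r.split(','))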
Creating the final dataframe:
import pandas as pd

df = pd.DataFrame(data)
df
Output:
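As an alternative to splitting by hand, here is a minimal sketch that also strips the quotes from the values. It assumes the same layout as the question: all rows on one line, each new row starting after a closing quote, a space, and a numeric id. The sample string, the regex, and the replacement of curly quotes with straight ones are assumptions made for illustration, not something confirmed by the original file.

```python
import csv
import re

# Sample in the question's layout, all rows on a single line (assumed input).
raw = ('3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" '
       '4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" '
       '6,"Per Knut Aaland","M",31,188,75,"United States","USA"')

# If the file really contains curly quotes, normalize them first (assumption).
raw = raw.replace('”', '"').replace('“', '"')

# A row boundary is a space preceded by a closing quote and followed by "<id>,".
rows = re.split(r'(?<=") (?=\d+,)', raw.strip())

# csv.reader removes the surrounding quotes and keeps commas inside quoted fields.
records = list(csv.reader(rows, quotechar='"'))

for record in records:
    print(record)
```

The resulting records list can be passed to pd.DataFrame(records) in the same way as data above; rows with fewer fields are padded with missing values.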