Pandas: reading a CSV file with " in the data

I want to parse a CSV file whose data looks like the sample below. When I use ," as the separator, the fields are not distributed to the columns correctly. Is there any way to ignore the " characters, or to escape them with a regex?

3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" 4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" 5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" 5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" 6,"Per Knut Aaland","M",31,188,75,"United States","USA"

Thanks in advance.

Answer

Read the CSV file (assuming there are no newlines between the rows):

with open('data') as f:
    raw = f.readline()

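Split the line into rows, then split each row into fields:

data = []
# strip() drops the trailing newline; splitting on '" ' consumes each row's closing quote
for r in raw.strip().split('" '):
    # strip('"') removes the quotes that still surround the text fields
    data.append([field.strip('"') for field in r.split(',')])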
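
A plain split(',') would also cut a quoted field that itself contains a comma. As a rough sketch of a more tolerant variant, the standard-library csv module can do the field splitting. The row boundary used here (a space between a closing quote and the digit that starts the next row's id) is an assumption based on the sample data, and raw is a shortened, hard-coded copy of that sample so the snippet runs on its own:

```python
import csv
import re

# Shortened copy of the sample line from the question (illustration only).
raw = ('3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" '
       '4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" '
       '6,"Per Knut Aaland","M",31,188,75,"United States","USA"\n')

# Assumption: a new row starts wherever a single space sits between a
# closing quote and a digit (the numeric id of the next row).
rows = re.split(r'(?<=") (?=\d)', raw.strip())

# csv.reader understands quoted fields: it removes the surrounding quotes
# and would keep a comma that appears inside a quoted field.
data = list(csv.reader(rows))
```

Every field comes back as a string here, including the numbers and the NA markers, so any type conversion is left to pandas or to a later step.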

Create the final DataFrame:

import pandas as pd

df = pd.DataFrame(data)
df

Output:

[Image: the resulting DataFrame]
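
If the rows can be put back on separate lines, pandas may be able to handle the quoting by itself, since read_csv treats " as the quote character by default. The following is only a sketch under the same assumption about the row boundary (closing quote, space, digit). The column labels 0 to 7 passed via names are placeholders chosen here so that rows with fewer fields are padded, and raw is again a shortened copy of the sample:

```python
import io
import re

import pandas as pd

# Shortened copy of the sample line from the question (illustration only).
raw = ('3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" '
       '4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" '
       '6,"Per Knut Aaland","M",31,188,75,"United States","USA"\n')

# Assumption: rows are separated by a space that follows a closing quote
# and precedes a digit; turn each such space into a newline.
text = re.sub(r'(?<=") (?=\d)', '\n', raw.strip())

# header=None: the data has no header row.
# names=list(range(8)): placeholder column labels; shorter rows get NaN.
df = pd.read_csv(io.StringIO(text), header=None, names=list(range(8)))
```

With the default settings, read_csv also interprets the NA entries as missing values, so the numeric columns that contain them end up as floats.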

User contributions licensed under: CC BY-SA