I have a CSV dataset like the sample below and I need to remove any empty rows from it. I already tried the following code, but it still fails when it reaches an empty row and raises:
pandas.errors.ParserError: Error tokenizing data. C error: Expected 7 fields in line 11, saw 8
def feed_db():
    try:
        employees = pd.read_csv(
            'employee.csv',
            delimiter=',',
            sep='t',
            encoding="utf-8",
            header=None,
            skipinitialspace=True,
            skip_blank_lines=True)
        employees.columns = [c.strip().lower().replace(' ', '_') for c in employees.columns]
        employees.to_sql('employees', conn, if_exists='replace', index=False)
    except Error as exc:
        raise Error('Database initialization failed', exc)
Sample dataset:
Employee Id, Full Name, Gender, Date of Birth, Joined Date, Salary (USD), Branch
EN_0001, Aditi Musunur, Male, 1990-03-24, 2011-07-05, 1500, Sri Lanka
EN_0002, Advitiya Sujeet, Male, 1986-07-28, 2010-03-24, 1600, Sri Lanka
EN_0003, Alagesan Poduri, Male, 1982-05-25, 2016-06-24, 1800, Sri Lanka
EN_0004, Amrish Ilyas, Female, 1987-10-24, 2013-12-17, 2000, India
EN_0005, Aprativirya Seshan, Female, 1981-12-16, 2012-03-14, 1750, India
EN_0006, Asvathama Ponnada, Male, 1986-01-09, 2014-06-18, 2300, Pakistan
EN_0007, Avantas Ghosal, Female, 1981-10-05, 2016-08-26, 4200, Pakistan
EN_0008, Avidosa Vaisakhi, Male, 1980-08-09, 2018-03-05, 3100, Bangladesh
EN_0009, Barsati Sandipa, Male, 1988-04-09, 2011-05-03, 2925, Bangladesh
EN_0010, Debasis Sundhararajan, Female, 1990-03-26, 2015-05-18, , 2800, Bangladesh
EN_0011, Debas Sundhar, Female, 1990-03-26, 2015-05-18, 2800, Bangladesh
How can I remove those empty rows using pandas?
Answer
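As mentioned, the problem is not an empty row but a redundant empty cell in line 11 of the file: the EN_0010 row has an extra ", ," between the joined date and the salary, so it has 8 fields instead of 7. You can open the CSV and fix that line, write the result to a StringIO buffer, then read the buffer with pd.read_csv: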
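import pandas as pd
import io

s = io.StringIO()
with open('employee.csv') as file:
    for line in file:
        # collapse the empty cell ", ," into a single separator
        s.write(line.replace(", ,", ","))
s.seek(0)  # rewind the buffer so read_csv starts at the top
df = pd.read_csv(s)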
Result:
| | Employee Id | Full Name | Gender | Date of Birth | Joined Date | Salary (USD) | Branch |
---|---|---|---|---|---|---|---|
0 | EN_0001 | Aditi Musunur | Male | 1990-03-24 | 2011-07-05 | 1500 | Sri Lanka |
1 | EN_0002 | Advitiya Sujeet | Male | 1986-07-28 | 2010-03-24 | 1600 | Sri Lanka |
2 | EN_0003 | Alagesan Poduri | Male | 1982-05-25 | 2016-06-24 | 1800 | Sri Lanka |
3 | EN_0004 | Amrish Ilyas | Female | 1987-10-24 | 2013-12-17 | 2000 | India |
4 | EN_0005 | Aprativirya Seshan | Female | 1981-12-16 | 2012-03-14 | 1750 | India |
5 | EN_0006 | Asvathama Ponnada | Male | 1986-01-09 | 2014-06-18 | 2300 | Pakistan |
6 | EN_0007 | Avantas Ghosal | Female | 1981-10-05 | 2016-08-26 | 4200 | Pakistan |
7 | EN_0008 | Avidosa Vaisakhi | Male | 1980-08-09 | 2018-03-05 | 3100 | Bangladesh |
8 | EN_0009 | Barsati Sandipa | Male | 1988-04-09 | 2011-05-03 | 2925 | Bangladesh |
9 | EN_0010 | Debasis Sundhararajan | Female | 1990-03-26 | 2015-05-18 | 2800 | Bangladesh |
10 | EN_0011 | Debas Sundhar | Female | 1990-03-26 | 2015-05-18 | 2800 | Bangladesh |
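The string replacement above only catches the exact text ", ,". If the spacing around the empty cell varies, or the file also contains truly blank lines, a slightly more general cleanup may help. The following is only a sketch under assumptions: the helper name drop_surplus_empty_cells is made up for illustration, the first non-blank row is assumed to be the header, and any empty cell in an over-long row is treated as surplus (which would be wrong for a row that also has a legitimately empty value).

```python
import csv
import io


def drop_surplus_empty_cells(lines):
    """Copy CSV lines into a StringIO, dropping blank rows and surplus empty cells.

    Hypothetical helper for illustration; not part of pandas.
    """
    cleaned = io.StringIO()
    writer = csv.writer(cleaned, lineterminator="\n")
    expected = None  # field count, taken from the header row

    for row in csv.reader(lines, skipinitialspace=True):
        row = [cell.strip() for cell in row]
        if not any(row):
            continue  # blank line: nothing to keep
        if expected is None:
            expected = len(row)  # assumption: first non-blank row is the header
        # Remove empty cells one at a time until the row fits the header.
        while len(row) > expected and "" in row:
            row.remove("")
        writer.writerow(row)

    cleaned.seek(0)  # rewind so the caller can read from the start
    return cleaned


# Small in-memory sample taken from the question's data, plus one blank line.
sample = [
    "Employee Id, Full Name, Gender, Date of Birth, Joined Date, Salary (USD), Branch\n",
    "EN_0009, Barsati Sandipa, Male, 1988-04-09, 2011-05-03, 2925, Bangladesh\n",
    "\n",
    "EN_0010, Debasis Sundhararajan, Female, 1990-03-26, 2015-05-18, , 2800, Bangladesh\n",
]
cleaned_text = drop_surplus_empty_cells(sample).getvalue()
print(cleaned_text)
```

With the real file, you would open employee.csv (ideally with newline='' as the csv module recommends), pass the file object to the helper, and hand the returned buffer to pd.read_csv in place of the file name.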