
Problem opening a CSV file with csv.reader

I have the following code:

import pandas as pd
import numpy as np
import csv
filename = (r"C:UsersZAppDataRoamingMicrosoftWindowsStart MenuProgramsAnaconda3 (64- bit)diabetes.csv")
raw_data = open(filename, 'rb')
reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
x = list(reader)
data = (np.array(x)).astype('float')
print(data.shape)

But it raises an error:

----> 7 x = list(reader)
Error: iterator should return strings, not bytes (did you open the file in text mode?)

When I change 'rb' to 'rt':

raw_data = open(filename, 'rt')

It says:

----> 8 data = (np.array(x)).astype('float')
ValueError: could not convert string to float: 'Pregnancies'

And when I delete .astype('float'), the result is (769, 9) but the expected result is (768, 9).

It counts the header as data. Can you tell me why?


Answer

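Instead of:

reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
x = list(reader)

call next(reader) once before building the list (with the file opened in text mode, 'rt'):

reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
next(reader)
x = list(reader)

This skips the header row of the CSV file. csv.reader treats every line as a data row, so the header is counted unless you consume it first. With the header skipped, x holds only the 768 data rows and .astype('float') no longer fails on 'Pregnancies'.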

This is described at https://docs.python.org/3/library/csv.html#csv.csvreader.__next__
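
To illustrate the header-skipping step end to end, here is a minimal sketch that uses only the standard library. The in-memory sample and the helper name read_numeric_rows are assumptions made for illustration; they stand in for diabetes.csv and are not part of the original code.

```python
import csv
import io

# Hypothetical stand-in for diabetes.csv: one header row, then numeric rows.
sample = io.StringIO(
    "Pregnancies,Glucose,Outcome\n"
    "6,148,1\n"
    "1,85,0\n"
)


def read_numeric_rows(text_file):
    """Return (header, rows), where rows is a list of lists of floats."""
    reader = csv.reader(text_file, delimiter=",", quoting=csv.QUOTE_NONE)
    header = next(reader)  # consume the first row so it is not treated as data
    rows = [[float(value) for value in row] for row in reader]
    return header, rows


header, rows = read_numeric_rows(sample)
print(header)
print(len(rows), len(rows[0]))
```

With a real file, the same function could be called inside `with open(filename, 'rt', newline='') as raw_data:`, since the csv documentation recommends passing newline='' when opening files for csv.reader. The resulting rows can then be passed to np.array if a NumPy array is needed.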

User contributions licensed under: CC BY-SA