Problem opening a CSV file with csv.reader

I have the following code:

import pandas as pd
import numpy as np
import csv
filename = r"C:\Users\Z\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Anaconda3 (64-bit)\diabetes.csv"
raw_data = open(filename, 'rb')
reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
x = list(reader)
data = (np.array(x)).astype('float')
print(data.shape)

But it raises an error:

----> 7 x = list(reader)
Error: iterator should return strings, not bytes (did you open the file in text mode?)

When I change 'rb' to 'rt':

raw_data = open(filename, 'rt')

It says:

----> 8 data = (np.array(x)).astype('float')
ValueError: could not convert string to float: 'Pregnancies'

And when I delete .astype('float'), the result is (769, 9) but the expected result is (768, 9).

It counts the header as data. Can you tell me why?

Answer

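Keep the file open in text mode ('rt'), as in your second attempt. csv.reader does not treat the first line specially, so the header comes back as an ordinary row. That is why you get 769 rows and why the string 'Pregnancies' reaches .astype('float'). Instead of:

reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
x = list(reader)

try: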

reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
next(reader)
x = list(reader)

The call to next(reader) consumes the first row, so it should skip the header of the CSV file and leave only the data rows in x.

It is described at https://docs.python.org/3/library/csv.html#csv.csvreader.__next__
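Putting the pieces together, here is a minimal sketch of the whole flow. It assumes every column after the header is numeric, and it uses a small made-up sample file (the `sample` string and the temporary file are illustrative stand-ins, not the real diabetes.csv) so it can be run anywhere:

```python
import csv
import os
import tempfile

# Hypothetical stand-in for diabetes.csv: one header row and two data rows.
sample = "Pregnancies,Glucose,Outcome\n6,148,1\n1,85,0\n"

with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write(sample)
    filename = tmp.name

# Open in text mode ('r' is the same as 'rt'); the csv docs recommend newline=''.
with open(filename, "r", newline="") as raw_data:
    reader = csv.reader(raw_data, delimiter=",", quoting=csv.QUOTE_NONE)
    header = next(reader)  # consume the header so it is not counted as data
    rows = [[float(value) for value in row] for row in reader]

os.remove(filename)  # clean up the temporary sample file

print(header)
print(len(rows), len(rows[0]))
```

With the real file, passing `rows` to `np.array` should then give the expected shape of (768, 9), provided the remaining rows contain only numbers.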



Source: stackoverflow