I have the following code:
    import pandas as pd
    import numpy as np
    import csv
    filename = (r"C:\Users\Z\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Anaconda3 (64- bit)\diabetes.csv")
    raw_data = open(filename, 'rb')
    reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
    x = list(reader)
    data = (np.array(x)).astype('float')
    print(data.shape)
But it errors:
    ----> 7 x = list(reader)
    Error: iterator should return strings, not bytes (did you open the file in text mode?)
When I change 'rb' to 'rt':
    raw_data = open(filename, 'rt')
It says:
    ----> 8 data = (np.array(x)).astype('float')
    ValueError: could not convert string to float: 'Pregnancies'
And when I delete .astype('float'), the result is (769, 9), but the expected result is (768, 9).
It counts the header as data. Can you tell me why?
Answer
Instead of the following:
    reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
    x = list(reader)
try
    reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
    next(reader)
    x = list(reader)
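which skips the header row of the CSV file. csv.reader has no notion of a header: it yields every line of the file as a row, so the column names ('Pregnancies', ...) come back as the first row unless you consume it with next(reader) before building the list. That is why you get 769 rows instead of 768, and why the conversion to float fails.
It is described at https://docs.python.org/3/library/csv.html#csv.csvreader.__next__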
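To see the effect without the original file, here is a minimal sketch that reads a small in-memory CSV. The sample text and the load_rows helper are made up for illustration and are not part of the question's code; only the csv.reader call and next(reader) come from the answer above.

```python
import csv
import io

# Hypothetical in-memory sample standing in for diabetes.csv (made-up values).
SAMPLE_CSV = "Pregnancies,Glucose,Outcome\n6,148,1\n1,85,0\n8,183,1\n"


def load_rows(text_file):
    """Return (header, rows), with every data field converted to float."""
    reader = csv.reader(text_file, delimiter=',', quoting=csv.QUOTE_NONE)
    header = next(reader)  # consume the first row so it is not treated as data
    rows = [[float(value) for value in row] for row in reader]
    return header, rows


header, rows = load_rows(io.StringIO(SAMPLE_CSV))
print(header)                   # the column names, kept separately
print(len(rows), len(rows[0]))  # number of data rows and columns
```

With a real file you would pass an object opened in text mode, for example open(filename, 'rt', newline=''), in place of io.StringIO. The header row is then kept out of the data, so the array built with np.array(x).astype('float') as in the question should no longer hit 'Pregnancies'.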