Suppose a .csv file that looks like this:
- title is the name of the column
- [senior innovation manager] is the first row

Note: both strings (title and row) look exactly as written here.
```text
title
[senior innovation manager]
```
The idea is to convert this string representation of a list to an actual Python list:
```python
import ast
import pandas as pd
import numpy as np

# read the file
df = pd.read_csv(file_path, sep=',', na_values='NA', encoding='latin-1')

# convert first row to actual python list
df['title'][0]=ast.literal_eval(df['title'][0])

# inspect if ast.literal_eval() converted to actual list:
print(df['title'][0])
print(type(df['title'][0]))
```
However, when I run the code above, the following error arises:
```text
Traceback (most recent call last):
  File "file_path", line 76, in <module>
    df['title'][0]=ast.literal_eval(df['title'][0])
  File "C:\Users\id\Anaconda3\lib\ast.py", line 46, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "C:\Users\id\Anaconda3\lib\ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    [senior innovation manager]
```
What’s the nature of this error?
Is it possible to convert this list string representation to an actual python list?
Answer
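The error arises because ast.literal_eval() only parses valid Python literals, and [senior innovation manager] is not one: the words inside the brackets are not quoted. I don’t see any advantage to treating this as a CSV file or using pandas. You could simply read the second line of the file and strip the unwanted stuff out. You can do that by grabbing a slice from the second character up to, but not including, the last one. In Python slice syntax, that’s [1:-1].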
```python
# file_path is the same path used in the question
with open(file_path) as fileobj:
    # skip title
    fileobj.readline()
    # get data
    title_list = [fileobj.readline().strip()[1:-1]]
```
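To see why the original attempt fails and how the slice behaves, here is a minimal sketch. It assumes the file content shown in the question, held in an in-memory buffer instead of a real file. The names sample, parse_bracketed and literal_eval_works are made up for illustration, and splitting on commas is an assumption about how multi-item rows would be written, since the question only shows a single item.

```python
import ast
import io

# Hypothetical in-memory stand-in for the file from the question.
sample = io.StringIO("title\n[senior innovation manager]\n")


def parse_bracketed(line):
    """Turn '[a, b]' with unquoted items into ['a', 'b'].

    Assumes items are separated by commas (not shown in the question).
    """
    inner = line.strip()[1:-1]  # drop the surrounding '[' and ']'
    return [item.strip() for item in inner.split(",") if item.strip()]


def literal_eval_works(text):
    """Report whether ast.literal_eval accepts the text as a Python literal."""
    try:
        ast.literal_eval(text)
        return True
    except (SyntaxError, ValueError):
        return False


sample.readline()  # skip the header line
rows = [parse_bracketed(line) for line in sample if line.strip()]

print(rows)  # [['senior innovation manager']]
print(literal_eval_works("[senior innovation manager]"))    # False: unquoted words
print(literal_eval_works("['senior innovation manager']"))  # True: quoted string
```

If the strings in the file were quoted, ast.literal_eval() would work directly; since they are not, slicing off the brackets is the simpler route.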