Here’s some data from another question:
   positive                        negative                neutral
1  [marvel, moral, bold, destiny]  []                      [view, should]
2  [beautiful]                     [complicated, need]     []
3  [celebrate]                     [crippling, addiction]  [big]
What I would do first is add quotes around every word, and then:

    import ast
    import pandas as pd

    # columns are separated by two or more whitespace characters
    df = pd.read_clipboard(sep=r'\s{2,}', engine='python')
    df = df.applymap(ast.literal_eval)
Is there a smarter way to do this?
Answer
Lists of strings
For basic structures you can use yaml without having to add quotes:
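    import pandas as pd
    import yaml

    # safe_load reads bare words as strings; plain yaml.load needs an explicit Loader in recent PyYAML
    df = pd.read_clipboard(sep=r'\s{2,}', engine='python').applymap(yaml.safe_load)
    type(df.iloc[0, 0])
    Out: list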
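To see what the YAML parser does with individual cells, here is a minimal sketch that skips the clipboard and feeds it the strings directly. It assumes PyYAML is installed; the `cells` list is just the first row of the question's data. It also shows a caveat worth knowing: bare words that YAML recognises as another type are converted, so this approach fits best when the words are plain text.

```python
import yaml  # PyYAML, a third-party package

# First row of the question's data, as the strings pandas would read
cells = ["[marvel, moral, bold, destiny]", "[]", "[view, should]"]
parsed = [yaml.safe_load(cell) for cell in cells]
print(parsed)

# Caveat: PyYAML converts some bare words to non-string types
coerced = yaml.safe_load("[yes, null, 42, plain]")
print(coerced)  # 'yes' becomes True, 'null' becomes None, '42' becomes an int
```

If your lists may contain words such as yes, no, null, or digits that must stay strings, quoting the data and using ast.literal_eval is probably the safer route.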
Lists of numeric data
Under certain conditions, you can read your lists as strings and then convert them using literal_eval (or pd.eval, if they are simple lists).
For example,
           A   B
0  [1, 2, 3]  11
1  [4, 5, 6]  12
First, ensure there are at least two spaces between the columns, then copy your data and run the following:
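    import ast
    import pandas as pd

    # split columns on two or more whitespace characters (regex separators need the python engine)
    df = pd.read_clipboard(sep=r'\s{2,}', engine='python')
    # turn the string "[1, 2, 3]" into an actual list
    df['A'] = df['A'].map(ast.literal_eval)
    df

               A   B
    0  [1, 2, 3]  11
    1  [4, 5, 6]  12

    df.dtypes

    A    object
    B     int64
    dtype: object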
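If you want to try this without the clipboard, the same idea can be sketched with an in-memory string. This version swaps pd.read_clipboard for pd.read_csv on an io.StringIO buffer and leaves out the index column to keep the parsing unambiguous; the `raw` text is illustrative only.

```python
import ast
import io

import pandas as pd

# Columns are separated by at least two spaces; list items by a single space
raw = (
    "A          B\n"
    "[1, 2, 3]  11\n"
    "[4, 5, 6]  12\n"
)

# A regex separator requires the python parsing engine
df = pd.read_csv(io.StringIO(raw), sep=r"\s{2,}", engine="python")

# Column A arrives as strings; convert each one to a real list
df["A"] = df["A"].map(ast.literal_eval)

print(df)
print(type(df.loc[0, "A"]))
```

Because the separator needs two or more spaces, the single spaces inside each list do not split the cell.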
Notes
- For multiple columns, use applymap in the conversion step (on pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):

      df[['A', 'B', ...]] = df[['A', 'B', ...]].applymap(ast.literal_eval)

- If your columns can contain NaNs, define a function that handles them appropriately:

      parser = lambda x: x if pd.isna(x) else ast.literal_eval(x)
      df[['A', 'B', ...]] = df[['A', 'B', ...]].applymap(parser)

- If your columns contain lists of strings, you will need something like yaml.safe_load (requires installing PyYAML) to parse them if you don't want to add quotes to the data manually. See above.
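If installing PyYAML is not an option, a small standard-library helper might be enough for flat lists of bare words. This is a different technique from the YAML approach, not a drop-in equivalent: `parse_bare_list` is a hypothetical name, it splits on commas, and it assumes no nested lists and no commas inside items. It passes missing values through unchanged, like the NaN-aware parser in the notes.

```python
import math


def parse_bare_list(cell):
    """Hypothetical helper: turn '[a, b]' into ['a', 'b'] without needing quotes."""
    # Pass missing values (None or float NaN) through unchanged
    if cell is None or (isinstance(cell, float) and math.isnan(cell)):
        return cell
    text = cell.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a bracketed list: {cell!r}")
    inner = text[1:-1].strip()
    if not inner:
        return []  # '[]' is an empty list
    # Assumes flat lists: no nesting and no commas inside items
    return [item.strip() for item in inner.split(",")]


print(parse_bare_list("[complicated, need]"))
print(parse_bare_list("[]"))
```

It could then be passed to applymap in place of ast.literal_eval.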