
How to get the average of averages for a column of lists of lists stored as strings?

I have a dataframe with a column like this:

import pandas as pd

data = [
    '[[0.1, 0.2, 0.3], [0, 0.5]]',
    '[[0.1, 0.2], [0.3, 0.4, 0.5], [0, 0.4]]'
]
df = pd.DataFrame(data, columns=['word_probs'])

Each row is a paragraph, each inner list is a sentence, and each number is the probability of one word in that sentence. The number of words and sentences varies from row to row. I would like to add another column, average_prob, that holds the average of the per-sentence averages for each row, so 0.225 and 0.25 here.

The data type of column word_probs is string.

How can I achieve this? Thanks a lot in advance!


Answer

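First convert each string to a list of lists with ast.literal_eval, then explode so that each sentence gets its own row, take the mean of each sentence, and finally group by the original index and average those means: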

import ast
import numpy as np

df.word_probs.map(ast.literal_eval).explode().map(np.mean).groupby(level=0).mean()
Out[408]: 
0    0.225
1    0.250
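
To store the result in the average_prob column the question asks for, the same chain can be split into steps and assigned back to the dataframe. This is a minimal sketch, assuming every row is a well-formed, non-empty list of non-empty lists; the intermediate names (parsed, sentences, sentence_means) are only illustrative.

```python
import ast

import numpy as np
import pandas as pd

data = [
    '[[0.1, 0.2, 0.3], [0, 0.5]]',
    '[[0.1, 0.2], [0.3, 0.4, 0.5], [0, 0.4]]',
]
df = pd.DataFrame(data, columns=['word_probs'])

# Parse each string into a real list of lists.
parsed = df['word_probs'].map(ast.literal_eval)

# One row per sentence; the original row index is repeated for each sentence.
sentences = parsed.explode()

# Mean of each sentence, then the mean of those means per original row.
sentence_means = sentences.map(np.mean)
df['average_prob'] = sentence_means.groupby(level=0).mean()

print(df['average_prob'].round(3).tolist())
```

Because groupby(level=0) groups on the original index, the result aligns with df when it is assigned as a new column.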
User contributions licensed under: CC BY-SA