I have a dataframe with a column like this:
import pandas as pd

data = [
    '[[0.1, 0.2, 0.3], [0, 0.5]]',
    '[[0.1, 0.2], [0.3, 0.4, 0.5], [0, 0.4]]',
]
df = pd.DataFrame(data, columns=['word_probs'])
Each row holds the probability of every word in every sentence of one paragraph; the number of words and sentences varies. I would like to add another column, average_prob, that is the average of the per-sentence averages in each row, so basically 0.225 and 0.25 here.
The data type of column word_probs is string.
How can I achieve this? Thanks a lot in advance!
Answer
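We first need to convert each string to a list with ast.literal_eval, then call explode so that every sentence gets its own row (the original row index is repeated). After that we take the mean of each sentence and average those means per original row with groupby(level=0).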
import ast
import numpy as np

df.word_probs.map(ast.literal_eval).explode().map(np.mean).groupby(level=0).mean()
Out[408]:
0    0.225
1    0.250
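To store the result as the average_prob column the question asks for, the same chain can be assigned back to the dataframe. The following is a minimal, self-contained sketch using the sample data from the question. The helper mean_of_sentence_means is a hypothetical name introduced here only as a cross-check and is not part of the original answer.

```python
import ast

import numpy as np
import pandas as pd

# Sample data from the question: each string encodes a list of sentences,
# and each sentence is a list of word probabilities.
data = [
    '[[0.1, 0.2, 0.3], [0, 0.5]]',
    '[[0.1, 0.2], [0.3, 0.4, 0.5], [0, 0.4]]',
]
df = pd.DataFrame(data, columns=['word_probs'])

# Parse the strings, put one sentence per row (the original index is repeated),
# average each sentence, then average the sentence means per original row.
df['average_prob'] = (
    df['word_probs']
    .map(ast.literal_eval)
    .explode()
    .map(np.mean)
    .groupby(level=0)
    .mean()
)


def mean_of_sentence_means(text):
    """Hypothetical helper: the same calculation done row by row."""
    sentences = ast.literal_eval(text)
    return float(np.mean([np.mean(sentence) for sentence in sentences]))


# Row-wise alternative, used here only to cross-check the chained version.
check = df['word_probs'].map(mean_of_sentence_means)

print(df[['average_prob']])
```

The assignment works because groupby(level=0) returns a Series indexed by the original row labels, so pandas aligns it with df automatically. The row-wise helper is easier to read, while the explode version keeps everything in pandas operations. Note that explode turns a row with an empty list into NaN, so empty paragraphs would need separate handling.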