Hi, I'm using pandas to display and analyze a CSV file. Some columns were 'object dtype' and were displayed as lists, so I used 'literal_eval' to convert the rows of a column named 'sdgs' to lists. My problem is how to use 'groupby', or any other approach, to display the count of every unique element stored in these lists, especially since many elements are shared between the lists.
import pandas as pd
from ast import literal_eval

df = pd.read_csv("../input/covid19-public-media-dataset/covid19_articles_20220420.csv")
df.dropna(subset=['sdgs'], inplace=True)
df = df[df.astype(str)['sdgs'] != '[]']   # drop rows whose sdgs value is an empty list
df.sdgs = df.sdgs.apply(literal_eval)     # parse the string representations into real lists
df.reset_index(drop=True, inplace=True)
This is a sample of the data; my problem is about the last column, 'sdgs', whose lists contain the elements I want to count.
Thank you
Answer
Given this example data:
import pandas as pd

df = pd.DataFrame({'domain': ['a', 'a', 'b', 'c'],
                   'sdgs': [['Just', 'a', 'sentence'],
                            ['another', 'sentence'],
                            ['a', 'word', 'and', 'a', 'word'],
                            ['nothing', 'here']]})
print(df)
  domain                     sdgs
0      a      [Just, a, sentence]
1      a      [another, sentence]
2      b  [a, word, and, a, word]
3      c          [nothing, here]
To get the word count across all the lists in the sdgs column, you can concatenate the lists with Series.agg and use collections.Counter:
import collections

# agg(sum) concatenates all the lists in the column into one flat list,
# and Counter tallies the occurrences of each element.
word_counts = collections.Counter(df['sdgs'].agg(sum))
print(word_counts)
Counter({'a': 3, 'sentence': 2, 'word': 2, 'Just': 1, 'another': 1, 'and': 1, 'nothing': 1, 'here': 1})
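If you prefer to stay within pandas, here is a minimal alternative sketch (assuming pandas >= 0.25, which introduced Series.explode): explode gives each list element its own row, and value_counts tallies the unique elements.

import pandas as pd

df = pd.DataFrame({'domain': ['a', 'a', 'b', 'c'],
                   'sdgs': [['Just', 'a', 'sentence'],
                            ['another', 'sentence'],
                            ['a', 'word', 'and', 'a', 'word'],
                            ['nothing', 'here']]})

# explode() flattens each list into one row per element;
# value_counts() then counts how often each element appears.
word_counts = df['sdgs'].explode().value_counts()
print(word_counts)

This produces the same counts (a: 3, sentence: 2, word: 2, and so on) as a pandas Series, which also makes it easy to sort, filter, or plot the result.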