I have the following dataframe:
    id occupations
   111     teacher
   111     student
   222     analyst
   333        cook
   111      driver
   444      lawyer
I create a new column with a list of all the occupations for each id:
df['occupation_list'] = df['id'].map(df.groupby('id')['occupations'].agg(list))
How do I only include teacher and student values in occupation_list?
Answer
You can filter before groupby:
to_map = (df[df['occupations'].isin(['teacher', 'student'])]
            .groupby('id')['occupations'].agg(list)
         )
df['occupation_list'] = df['id'].map(to_map)
Output:
    id occupations     occupation_list
0  111     teacher  [teacher, student]
1  111     student  [teacher, student]
2  222     analyst                 NaN
3  333        cook                 NaN
4  111      driver  [teacher, student]
5  444      lawyer                 NaN
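For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same filter-before-groupby approach. The dataframe is rebuilt from the values in the question; the names wanted and occupation_list_filled are not from the original post and are only illustrative. The last step, which swaps NaN for an empty list, is an optional extra and assumes you would rather have lists in every row.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

# Hypothetical name: the occupations we want to keep in the lists.
wanted = ["teacher", "student"]

# Keep only the matching rows, then collect them into one list per id.
to_map = (
    df[df["occupations"].isin(wanted)]
    .groupby("id")["occupations"]
    .agg(list)
)

# Map the per-id lists back onto every row; ids with no match get NaN.
df["occupation_list"] = df["id"].map(to_map)

# Optional (assumption, not part of the original answer):
# use an empty list instead of NaN for ids without a match.
df["occupation_list_filled"] = df["occupation_list"].apply(
    lambda v: v if isinstance(v, list) else []
)

print(df)
```

Note that the row for driver still gets [teacher, student], because the list is built per id and then mapped back to every row with that id, not filtered per row.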