Custom Column Selection in Pandas DataFrame.Groupby.Agg’s dictionary

Question

I have a problem selecting which columns to pass to Pandas DataFrame.groupby(...).agg.

Here's the code to get and prepare the data:

Which results in:

What I've done so far is:

That results in:

The question is:

1) How do I include other non-numeric columns?
2) How do I include other undetermined columns in the dictionary and set the method as

Accepted Answer

1) To determine whether a column is numeric, you can use pandas.api.types.is_numeric_dtype.

2) To find the remaining columns, take set(df.columns) minus the columns you used in groupby and those that already have specific agg functions, for example:

    from pandas.api.types import is_numeric_dtype

    fields_groupby = ['Day', 'Month']
    fields_specific = {
        'High': [min, 'mean', max],
        'Low': [min, 'mean', max],
        'Open': 'mean',
        'Size': lambda x: x.value_counts().index[0],  # most frequent value
    }
    # Every column that is neither a grouping key nor explicitly configured
    fields_other = set(stock.columns) - set(fields_groupby) - set(fields_specific)
    # Note: index[1] picks the second most frequent value and raises IndexError
    # when a group has only one distinct value; use index[0] for the most frequent.
    fields_agg_remaining = {
        col: 'mean' if is_numeric_dtype(stock[col])
        else (lambda x: x.value_counts().index[1])
        for col in fields_other
    }

After that, combine fields_specific and fields_agg_remaining into the full mapping passed to agg:

    agg_fields = fields_agg_remaining
    agg_fields.update(fields_specific)
    stock.groupby(['Day', 'Month']).agg(agg_fields).round(2)

EDIT: You can also put everything inside the dictionary argument. Each lambda has to be wrapped in parentheses here; otherwise the body of the first lambda absorbs the rest of the conditional chain, and every remaining column gets that one lambda as its aggregator. For example:

    stock.groupby(['Day', 'Month']).agg(
        {col:
            [min, 'mean', max] if col in ['High', 'Low'] else
            'mean' if col in ['Open'] else
            (lambda x: x.value_counts().index[0]) if col in ['Size'] else
            'mean' if is_numeric_dtype(stock[col]) else
            (lambda x: x.value_counts().index[1])
         for col in set(stock.columns) - {'Day', 'Month'}}
    ).round(2)
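The approach in this answer can be tried end to end on a small frame. The following is only a sketch under stated assumptions: the toy `stock` data, the `Volume` and `Ticker` columns, and the helper names `most_frequent` and `build_agg_spec` are hypothetical and do not come from the question. It uses a plain loop instead of a comprehension, and `index[0]` (the most frequent value) as the default for non-numeric columns.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def most_frequent(series):
    """Return the most common value in a Series (hypothetical helper)."""
    return series.value_counts().index[0]


def build_agg_spec(df, group_cols, specific):
    """Build an agg mapping: explicit rules first, then a dtype-based default."""
    spec = {}
    for col in df.columns:
        if col in group_cols:
            continue  # grouping keys are not aggregated
        if col in specific:
            spec[col] = specific[col]      # rule given by the caller
        elif is_numeric_dtype(df[col]):
            spec[col] = "mean"             # default for numeric columns
        else:
            spec[col] = most_frequent      # default for everything else
    return spec


# Hypothetical toy data standing in for the `stock` frame of the question.
stock = pd.DataFrame({
    "Day": [1, 1, 2, 2],
    "Month": [1, 1, 1, 1],
    "High": [10.0, 12.0, 11.0, 15.0],
    "Volume": [100, 300, 200, 400],
    "Size": ["big", "big", "small", "small"],
    "Ticker": ["A", "A", "B", "B"],
})

spec = build_agg_spec(stock, ["Day", "Month"], {"High": ["min", "mean", "max"]})
result = stock.groupby(["Day", "Month"]).agg(spec).round(2)
print(result)
```

Iterating over `df.columns` rather than a set keeps the output columns in their original order, and a named function avoids the operator-precedence pitfalls of lambdas inside a chained conditional expression.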
