Skip to content
Advertisement

Combine rows and average column if another column is minimum

I have a pandas dataframe:

             Server  Clock 1  Clock 2  Power   diff
0  PhysicalWindows1     3400   3300.0   58.5  100.0
1  PhysicalWindows1     3400   3500.0   63.0  100.0
2  PhysicalWindows1     3400   2900.0   25.0  500.0
3  PhysicalWindows2     3600   3300.0   83.8  300.0
4  PhysicalWindows2     3600   3500.0   65.0  100.0
5  PhysicalWindows2     3600   2900.0   10.0  700.0
6    PhysicalLinux1     2600      NaN    NaN    NaN
7    PhysicalLinux1     2600      NaN    NaN    NaN
8              Test     2700   2700.0   30.0    0.0

Basically, I would like to average the Power for each server but only if the difference is minimum. For example, if you look at the ‘PhysicalWindows1’ server, I have 3 rows, two have a diff of 100, and one has a diff of 500. Since I have two rows with a diff of 100, I would like to average out my Power of 58.5 and 63.0. For ‘PhysicalWindows2’, since there is only one row which has the least diff, we return the power for that one row – 65. If NaN, return Nan, and if there is only one match, return the power for that one match.

My resultant dataframe would look like this:

             Server  Clock 1            Power  
0  PhysicalWindows1     3400    (58.5+63.0)/2
1  PhysicalWindows2     3600             65.0
2    PhysicalLinux1     2600              NaN
3              Test     2700             30.0

Advertisement

Answer

Use groupby with dropna=False to avoid to remove PhysicalLinux1 and sort=True to sort index level (lowest diff on top) then drop_duplicates to keep only one instance of (Server, Clock 1):

out = (df.groupby(['Server', 'Clock 1', 'diff'], dropna=False, sort=True)['Power']
         .mean().droplevel('diff').reset_index().drop_duplicates(['Server', 'Clock 1']))

# Output
             Server  Clock 1  Power
0    PhysicalLinux1     2600    NaN
1  PhysicalWindows1     3400  60.75
3  PhysicalWindows2     3600  65.00
6              Test     2700  30.00
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement