Combine rows and average column if another column is minimum

Question

I have a pandas dataframe. I would like to average the Power for each server, but only over the rows where the diff is at its minimum. For example, the 'PhysicalWindows1' server has 3 rows: two have a diff of 100, and one has a diff of 500. Since I have two rows with a diff of 100, I

Accepted Answer

Use groupby with dropna=False so that PhysicalLinux1 is not dropped, and sort=True to sort the index levels (lowest diff on top). Then use drop_duplicates to keep only the first instance of each (Server, Clock 1) pair:

    out = (df.groupby(['Server', 'Clock 1', 'diff'], dropna=False, sort=True)['Power']
             .mean().droplevel('diff').reset_index().drop_duplicates(['Server', 'Clock 1']))

    # Output
                 Server  Clock 1  Power
    0    PhysicalLinux1     2600    NaN
    1  PhysicalWindows1     3400  60.75
    3  PhysicalWindows2     3600  65.00
    6              Test     2700  30.00
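Since the original dataframe is not shown here, the following is a minimal sketch of the same approach on made-up data. The server names (srv-a, srv-b, srv-c) and all values are assumptions chosen for illustration, not the asker's data. It shows why the sort matters: the group with the lowest diff comes first for each server, so drop_duplicates keeps that group's mean and discards the rest.

```python
import numpy as np
import pandas as pd

# Hypothetical data: srv-a has two rows at the minimum diff (100) and one at 500;
# srv-b has a missing diff and a missing Power.
df = pd.DataFrame({
    "Server": ["srv-a", "srv-a", "srv-a", "srv-b", "srv-c"],
    "Clock 1": [3400, 3400, 3400, 2600, 2700],
    "diff": [100, 500, 100, np.nan, 200],
    "Power": [60.0, 70.0, 61.5, np.nan, 30.0],
})

out = (
    df.groupby(["Server", "Clock 1", "diff"], dropna=False, sort=True)["Power"]
      .mean()                                   # one mean per (Server, Clock 1, diff)
      .droplevel("diff")                        # diff is only needed for ordering
      .reset_index()
      .drop_duplicates(["Server", "Clock 1"])   # keep the first (lowest-diff) group
)

print(out)
```

In this sketch srv-a ends up with the mean of its two diff-100 rows (60.0 and 61.5), the diff-500 row is ignored, and srv-b survives with a NaN Power because dropna=False keeps the group whose key is missing.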

Answer