Skip to content
Advertisement

How to create multiple lists with the sum of different columns?

I have a large data set of machine speeds per simulation. I have a column or which simulation it is as well as columns for the different machines. Now I would like to have different lists of the sum of each machine per simulation so that I can do analysis on this as well as create plots for it.

Here is a MRS:

df = {'sim': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3, 8: 3}, 
      'mach_1': {0: 1, 1: 5, 2: 4, 3: 2, 4: 6, 5: 8, 6: 2, 7: 5, 8: 8}, 
      'mach_2': {0: 4, 1: 5, 2: 2, 3: 3, 4: 6, 5: 5, 6: 4, 7: 5, 8: 2}, 
      'mach_3': {0: 7, 1: 8, 2: 9, 3: 4, 4: 4, 5: 6, 6: 8, 7: 5, 8: 6}}

I know that using works, but this just does it for one machine. Meanwhile I have about 20 and would like to know if there is an easier way (if lists are not the best solution I am also open to all suggestions):

lst_m1 = []
for i in range(4):
    lst_m1.append(df['mach_1'][df['sim'] == i].sum())

Advertisement

Answer

You can filter rows by list, here range and then aggregate sum:

df1 = df[df['sim'].isin(range(4))].groupby('sim', as_index=False)['mach_1'].sum()
print (df1)
   sim  mach_1
0    1      10
1    2      16
2    3      15
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement