Using pandas, I open some csv files in a loop and set the index to the cycleID column, except the cycleID column is not unique. See below:
for filename in all_files:
abfdata = pd.read_csv(filename, index_col=None, header=0)
abfdata = abfdata.set_index("cycleID", drop=False)
for index, row in abfdata.iterrows():
print(row['cycleID'], row['mean'])
This prints the 2 columns (cycleID and mean) of the dataframe I am interested in for further computations:
1 1.5020712104685252e-11 1 6.56683605063102e-12 2 1.3993315187144084e-11 2 -8.670502467042485e-13 3 7.0270625256163566e-12 3 9.509995221868016e-12 4 1.2901435995915644e-11 4 9.513106448422182e-12
The objective is to use the rows corresponding to the same cycleID and calculate the difference between the mean column values. So, if there are 8 rows in the table, the final array or list would store 4 values.
I want to make it scalable as well where there can be 3 or more rows with the same cycleIDs. In that case, each cycleID could have 2 or more mean differences.
Update: Instead of creating a new ques about it, I thought I’d add here. I used the diff and groupby approach as mentioned in the solution. It works great but I have this extra need to save one of the mean values (odd row or even row doesn’t matter) in a new column and make that part of the new data frame as well. How do I do that?
Advertisement
Answer
You can use groupby
s2= df.groupby(['cycleID'])['mean'].diff() s2.dropna(inplace=True)
output
1 -8.453876e-12 3 -1.486037e-11 5 2.482933e-12 7 -3.388330e-12 8 3.000000e-12
UPDATE
d = [[1, 1.5020712104685252e-11],
[1, 6.56683605063102e-12],
[2, 1.3993315187144084e-11],
[2, -8.670502467042485e-13],
[3, 7.0270625256163566e-12],
[3, 9.509995221868016e-12],
[4, 1.2901435995915644e-11],
[4, 9.513106448422182e-12]]
df = pd.DataFrame(d, columns=['cycleID', 'mean'])
df2 = df.groupby(['cycleID']).diff().dropna().rename(columns={'mean': 'difference'})
df2['mean'] = df['mean'].iloc[df2.index]
difference mean 1 -8.453876e-12 6.566836e-12 3 -1.486037e-11 -8.670502e-13 5 2.482933e-12 9.509995e-12 7 -3.388330e-12 9.513106e-12