I have an excel file that can calculate the last column with a formula. I am trying to replicate this into python code. How can I perform an operation to achieve such results?
Essentially, the last column is supposed to take, for example, row 2 and subtract itself by row 0 (because they share the same industry besides the month, which I am trying to compare the two by).
I tried using a for loop and take the index of row i
then subtract row i
‘s ‘No. of jobs’ by the row i-2
‘s ‘No. of jobs’ and adding that result to the last column. Then, I would increment i
by 1 to let the operation continue for the next row, but I was unsuccessful.
Period Industry No. of jobs Difference from prev. month 0 January Farm 70200 N/A 1 January Mining 4900 N/A 2 February Farm 70100 -100 3 February Mining 4850 -50 4 March Farm 70200 100 5 March Mining 4600 -250 6 April Farm 70300 100 7 April Mining 5200 600 8 May Farm 70300 0 9 May Mining 5300 100
Advertisement
Answer
Try as follows:
df['Difference from prev. month'] = df['No. of jobs'].groupby(df['Industry']).diff() df['Difference from prev. month'] 0 NaN 1 NaN 2 -100.0 3 -50.0 4 100.0 5 -250.0 6 100.0 7 600.0 8 0.0 9 100.0 Name: Difference from prev. month, dtype: float64