How can I calculate the differences between neighbouring values in a dataframe column named 'y'
using only pandas commands?
Here is an example where I first convert the column 'y'
to a NumPy array and then use np.diff.
import numpy as np
import pandas as pd

np.random.seed(10)

df = pd.DataFrame(np.random.randint(0,10,size=(10,2)),columns=['x', 'y'])

y=df['y'].values

diff_y=np.diff(y)

print(np.array([y[0:-1],diff_y]).T)

[[ 4 -3]
 [ 1 -1]
 [ 0  8]
 [ 8 -8]
 [ 0  6]
 [ 6 -3]
 [ 3  1]
 [ 4  4]
 [ 8  0]]
Answer
You could use diff to compute the differences and shift(-1)
to move each difference up one row, so that it lines up with the first value
of its pair (as in your output):
df['diff_y'] = df['y'].diff().shift(-1)
print(df[['y', 'diff_y']])
Output:
   y  diff_y
0  4    -3.0
1  1    -1.0
2  0     8.0
3  8    -8.0
4  0     6.0
5  6    -3.0
6  3     1.0
7  4     4.0
8  8     0.0
9  8     NaN
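The diff_y column is float because the last row has no following value and is filled with NaN. As a rough sketch of a few equivalent pandas-only spellings, the snippet below hardcodes the y values from the example (instead of regenerating them from the seed), and the variable name forward is only illustrative. It checks that the explicit "next minus current" form, the diff().shift(-1) form, and a negated diff(-1) agree, and that dropping the trailing NaN recovers the integer result of np.diff:

```python
import numpy as np
import pandas as pd

# Same y values as in the question's example (hardcoded here for clarity).
df = pd.DataFrame({'y': [4, 1, 0, 8, 0, 6, 3, 4, 8, 8]})

# Forward difference written out explicitly: next value minus current value.
forward = df['y'].shift(-1) - df['y']

# Same result as the diff().shift(-1) approach.
assert forward.equals(df['y'].diff().shift(-1))

# diff(-1) computes current minus next, so negating it gives the same values.
assert forward.equals(-df['y'].diff(-1))

# Dropping the trailing NaN allows casting back to integers,
# which matches what np.diff returns for the same column.
diff_y = forward.dropna().astype(int)
assert np.array_equal(diff_y.to_numpy(), np.diff(df['y'].to_numpy()))

print(diff_y.tolist())
```

Which spelling to prefer is mostly a matter of taste; diff().shift(-1) keeps the result the same length as the dataframe, so it can be assigned directly as a new column.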