How can I calculate the differences between neighbouring values in a DataFrame column named 'y' using only pandas commands?
Here is an example where I first convert the column 'y' to a NumPy array and then use np.diff:
    import numpy as np
    import pandas as pd

    np.random.seed(10)
    df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=['x', 'y'])

    y = df['y'].values
    diff_y = np.diff(y)
    print(np.array([y[0:-1], diff_y]).T)

    [[ 4 -3]
     [ 1 -1]
     [ 0  8]
     [ 8 -8]
     [ 0  6]
     [ 6 -3]
     [ 3  1]
     [ 4  4]
     [ 8  0]]
Answer
You could use diff to compute the differences and shift to align them with the preceding row (as in your output):
    df['diff_y'] = df['y'].diff().shift(-1)
    print(df[['y', 'diff_y']])
Output:
       y  diff_y
    0  4    -3.0
    1  1    -1.0
    2  0     8.0
    3  8    -8.0
    4  0     6.0
    5  6    -3.0
    6  3     1.0
    7  4     4.0
    8  8     0.0
    9  8     NaN
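The diff_y column is float because the last row has no successor and is filled with NaN. If integer differences are wanted, as np.diff returns, one possible approach is to drop that trailing NaN and cast back to int. The sketch below is only an illustration under a few assumptions: the 'y' values are typed in by hand from the output above rather than regenerated with the random seed, and the names forward, forward_alt and diff_int are made up for this example. It also shows that subtracting the column from its shifted copy appears to give the same forward difference.

```python
import pandas as pd

# 'y' values copied by hand from the output above (assumption: no random seed involved).
df = pd.DataFrame({"y": [4, 1, 0, 8, 0, 6, 3, 4, 8, 8]})

# Forward difference y[i+1] - y[i], aligned to row i, as in the answer.
forward = df["y"].diff().shift(-1)

# Alternative spelling: shift the column up by one row and subtract the original.
forward_alt = df["y"].shift(-1) - df["y"]

# Drop the trailing NaN and cast to int to mirror what np.diff returns.
diff_int = forward.dropna().astype(int)

print(pd.DataFrame({"y": df["y"].iloc[:-1], "diff_y": diff_int}))
```

Note that dropna shortens the result by one row, so it no longer lines up with the full DataFrame; keep the float column if the differences must stay attached to df.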