Create array of differences in col between two adjacent numbers in an array python/pyspark

Question

I have a column of arrays made of numbers, e.g. [0,80,160,220], and would like to create a column of arrays holding the differences between adjacent terms, e.g. [80,80,60]. Does anyone have an idea how to approach this in Python or PySpark? I'm thinking of something iterative (the i-th term minus the (i-1)-th term, starting at the second term) but am really stuck on how to do it.

Accepted Answer

Edit:

    import pandas as pd

    d = [0, 80, 160, 220]
    df = pd.DataFrame(d, columns=['col_list'])
    # diff() subtracts the previous row from each row; the first row has no predecessor, so it is NaN
    df['col_new'] = df['col_list'].diff()
    print(df)

    # output
       col_list  col_new
    0         0      NaN
    1        80     80.0
    2       160     80.0
    3       220     60.0

Also, if you want to delete the row with NaN you can do:

    df.dropna(subset=['col_new'])

    # output
       col_list  col_new
    1        80     80.0
    2       160     80.0
    3       220     60.0
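
The snippet above takes differences down a single column of scalars, whereas the question describes a column whose cells are themselves arrays. A minimal plain-Python sketch of the iterative idea from the question (each term minus the one before it) could look like the following. The helper name `adjacent_diffs` and the sample `rows` are made up for illustration and are not part of the original answer.

```python
def adjacent_diffs(values):
    """Return the differences between each pair of adjacent items."""
    # Pair every item with its successor: (v0, v1), (v1, v2), ...
    # zip stops at the shorter input, so lists of length 0 or 1 give [].
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical sample data: each inner list stands for one cell of the column.
rows = [[0, 80, 160, 220], [5, 7, 12], [42], []]
result = [adjacent_diffs(row) for row in rows]
print(result)  # [[80, 80, 60], [2, 5], [], []]
```

If the data lives in a pandas DataFrame, something like `df['col_new'] = df['col_list'].apply(adjacent_diffs)` should apply the helper to each cell, assuming every cell holds a Python list.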
