Sorry for the noob question. I have a bunch of stocks in a sqlite3 database:
import pandas as pd
import sqlite3, config

connection = sqlite3.connect(config.db_file)
connection.row_factory = sqlite3.Row

df = pd.read_sql('SELECT * FROM stock_price', connection)

# sort the dataframe
df.sort_values(by='stock_id', inplace=True)

# set the index to be this and don't drop
df.set_index(keys=['stock_id'], drop=False, inplace=True)
When I print the df, it gives me the following (where each stock_id refers to a unique stock, e.g. AAPL):
           id  stock_id        date    open     high     low    close   volume
stock_id
1           1         1  2022-08-02    9.83    9.845    9.83    9.830   584772
1           2         1  2022-08-03    9.84    9.860    9.84    9.820     7711
4           3         4  2022-08-03   10.38   10.380   10.38   10.380      199
5          46         5  2022-08-03   34.75   35.200   34.75   35.200     1007
5          45         5  2022-08-02   34.32   34.550   34.32   34.442     1252
...       ...       ...         ...     ...      ...     ...      ...      ...
98          8        98  2022-08-02   28.00   28.095   27.90   28.000     2417
99         71        99  2022-08-02   88.19   88.940   87.15   88.370  1045596
99         72        99  2022-08-03   88.34   88.550   87.65   88.410   982710
100       171       100  2022-08-02  117.58  120.010  117.08  119.270    67795
100       172       100  2022-08-03  119.80  121.940  120.60  121.440     4237

[178 rows x 8 columns]
I need to target each unique stock_id individually and get the prior close.
I know if each stock was in its own separate dataframe, I could do something like this:
final_df['previous close'] = final_df['c'].shift()
But when I’ve tried that, because everything is in one dataframe, one stock ends up getting the previous close of an entirely different stock, which isn’t what I want.
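To illustrate with a toy example (made-up values, not my real data), a plain shift() on the combined dataframe bleeds across stocks:

import pandas as pd

# two stocks stacked in one dataframe (made-up values)
toy = pd.DataFrame({
    'stock_id': [1, 1, 2, 2],
    'close':    [9.83, 9.82, 10.38, 10.40],
})

# shift() ignores stock_id, so stock 2's first row picks up
# stock 1's last close (9.82) as its "previous close"
toy['previous close'] = toy['close'].shift()
print(toy)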
So my question:
What’s the best way to split all these different stocks out of one single dataframe so that I can target them individually and get the previous close price of each stock?
Answer
If I understand the question correctly, you just want the close of the previous row within each stock group. You can do that easily:
# level=0 refers to the stock_id index set above
newdf = df.assign(prev_close=df.groupby(level=0)['close'].shift())
or, before setting the index to stock_id:
newdf = df.assign(prev_close=df.groupby('stock_id')['close'].shift())
On your data sample:
>>> newdf
           id  stock_id        date    open     high     low    close   volume  prev_close
stock_id
1           1         1  2022-08-02    9.83    9.845    9.83    9.830   584772         NaN
1           2         1  2022-08-03    9.84    9.860    9.84    9.820     7711        9.83
4           3         4  2022-08-03   10.38   10.380   10.38   10.380      199         NaN
5          46         5  2022-08-03   34.75   35.200   34.75   35.200     1007         NaN
5          45         5  2022-08-02   34.32   34.550   34.32   34.442     1252       35.20
98          8        98  2022-08-02   28.00   28.095   27.90   28.000     2417         NaN
99         71        99  2022-08-02   88.19   88.940   87.15   88.370  1045596         NaN
99         72        99  2022-08-03   88.34   88.550   87.65   88.410   982710       88.37
100       171       100  2022-08-02  117.58  120.010  117.08  119.270    67795         NaN
100       172       100  2022-08-03  119.80  121.940  120.60  121.440     4237      119.27
Side note: try not to use inplace=True all over the place. It will cause you trouble in the long run (see e.g. here).
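For example, the loading code could be written with method chaining instead (a sketch, reusing your config.db_file and stock_price table):

import pandas as pd
import sqlite3, config

connection = sqlite3.connect(config.db_file)

# chain the steps instead of mutating df in place
df = (
    pd.read_sql('SELECT * FROM stock_price', connection)
      .sort_values(by='stock_id')
      .set_index('stock_id', drop=False)
)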
Personally, I would prefer this format, which also sorts each stock by date so that prev_close is always the chronologically previous close:
cleandf = df.assign(
    date=pd.to_datetime(df['date'])
).set_index(['stock_id', 'date']).sort_index()

newdf = cleandf.assign(prev_close=cleandf.groupby('stock_id')['close'].shift())

>>> newdf
                      id    open     high     low    close   volume  prev_close
stock_id date
1        2022-08-02    1    9.83    9.845    9.83    9.830   584772         NaN
         2022-08-03    2    9.84    9.860    9.84    9.820     7711       9.830
4        2022-08-03    3   10.38   10.380   10.38   10.380      199         NaN
5        2022-08-02   45   34.32   34.550   34.32   34.442     1252         NaN
         2022-08-03   46   34.75   35.200   34.75   35.200     1007      34.442
98       2022-08-02    8   28.00   28.095   27.90   28.000     2417         NaN
99       2022-08-02   71   88.19   88.940   87.15   88.370  1045596         NaN
         2022-08-03   72   88.34   88.550   87.65   88.410   982710      88.370
100      2022-08-02  171  117.58  120.010  117.08  119.270    67795         NaN
         2022-08-03  172  119.80  121.940  120.60  121.440     4237     119.270
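With the ['stock_id', 'date'] index in place you can also target any single stock directly; a sketch, using stock_id 5 from the sample:

# all rows for one stock, keeping the date index
one_stock = newdf.xs(5, level='stock_id')

# e.g. the most recent previous close for that stock
last_prev_close = one_stock['prev_close'].iloc[-1]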