Python Pandas: Append column value, based on another same column value

Question

I have a pandas dataframe like this. I want to append Town value, which is based on row have the same Source, Level and County value. I have tried isin, groupby, diff(but my value is str), but still not figure out. Image below is what I want to get. Really appreciate your help! Answer The way we can make this

Accepted Answer

The way we can make this work is by creating a list out of it using groupby() and apply(list), we can then transform this into a string separated by comma. Let&#8217;s split it into 2 steps for better understanding.Personally I would keep this data as a list within a pandas series and not do step 2. Formatting as string separated by comma might not be ideal to work with.Step 1:output = df.groupby(['Time','Source','Level','County'])['Town'].apply(list).reset_index()Returns:                      Time Source Level County            Town0  2021-12-01 10:01:41.443    NaN   NaN    NaN           [nan]1  2021-12-01 10:01:46.452    NaN   NaN    NaN           [nan]2                      NaN   Test     3     C1         [C1-T1]3                      NaN   Test    5-     C2  [C2-T0, C2-T1]Now, we can format them correctly into strings (step 2):output['Town'] = pd.Series([', '.join([y for y in x if type(y) == str]) for x in output['Town']]).replace('',np.nan)Which outputs our desired result:                      Time Source Level County          Town0  2021-12-01 10:01:41.443    NaN   NaN    NaN           NaN1  2021-12-01 10:01:46.452    NaN   NaN    NaN           NaN2                      NaN   Test     3     C1         C1-T13                      NaN   Test    5-     C2  C2-T0, C2-T1

Advertisement

Answer