If any cell contains a comma, I would like to split it on the comma and keep only the last item.
The original table looks like this:
index | x1 | x2 |
---|---|---|
0 | banana | orange |
1 | grapes, Citrus | apples |
2 | tangerine, tangerine | melons, pears |
and I would like to change it to this:
index | x1 | x2 |
---|---|---|
0 | banana | orange |
1 | Citrus | apples |
2 | tangerine | pears |
As you can see, in every cell that contains a comma only the last fruit name is kept, and this is applied to all cells in the DataFrame.
To do that, I was thinking of using apply with a function that splits on the comma, but please let me know if there’s a better way.
Thanks.
Answer
You can do that with the .str accessor:

    >>> df
                             x1             x2
    index
    0                    banana         orange
    1            grapes, Citrus         apples
    2      tangerine, tangerine  melons, pears
    >>> df.apply(lambda col: col.str.split(', ').str[-1], axis=1)
                  x1      x2
    index
    0         banana  orange
    1         Citrus  apples
    2      tangerine   pears

Or, in steps:

    >>> df['x1'] = df['x1'].str.split(', ').str[-1]
    >>> df['x2'] = df['x2'].str.split(', ').str[-1]
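
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes every cell is a string, rebuilds the table from the question by hand, and uses a helper called last_item, which is a hypothetical name chosen for illustration. Unlike the snippets above, it splits on a bare comma and strips whitespace afterwards, so it should also cope with cells that have no space after the comma.

```python
import pandas as pd

# Rebuild the example table from the question (values copied by hand).
df = pd.DataFrame(
    {
        "x1": ["banana", "grapes, Citrus", "tangerine, tangerine"],
        "x2": ["orange", "apples", "melons, pears"],
    }
)
df.index.name = "index"


def last_item(col: pd.Series) -> pd.Series:
    """Hypothetical helper: keep the text after the last comma in each cell."""
    # Split on commas, take the last piece, then trim surrounding whitespace.
    # Cells without a comma split into a one-element list, so they are kept as-is.
    return col.str.split(",").str[-1].str.strip()


# With the default axis=0, apply passes one column (a Series) at a time.
result = df.apply(last_item)
print(result)
```

Passing the helper to apply without axis=1 processes one column at a time, which matches the step-by-step version and avoids listing each column by name.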