Comma separation for each cell of a pandas DataFrame

If a cell contains a comma, I would like to split it on the comma and keep only the last item.

The original table looks like this:

index  x1                    x2
0      banana                orange
1      grapes, Citrus        apples
2      tangerine, tangerine  melons, pears

which should be changed to the following:

index  x1         x2
0      banana     orange
1      Citrus     apples
2      tangerine  pears

As you can see, every cell that contained a comma now holds only its last fruit name, and cells without a comma are unchanged.

To do that, I would like to use apply with a function that splits on the comma, but please let me know if there’s a better way.

Thanks.

Answer

You can do this with the .str accessor, which splits each string and lets you pick the last piece:

>>> df
 
                         x1             x2
index                                     
0                    banana         orange
1            grapes, Citrus         apples
2      tangerine, tangerine  melons, pears

>>> df.apply(lambda col: col.str.split(', ').str[-1])

              x1      x2
index                   
0         banana  orange
1         Citrus  apples
2      tangerine   pears

Or, in steps:

>>> df['x1'] = df['x1'].str.split(', ').str[-1]
>>> df['x2'] = df['x2'].str.split(', ').str[-1]
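
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question and uses a slightly more forgiving variant: it splits once from the right on a bare comma and then strips whitespace, so cells written as "a,b" or "a ,  b" should behave the same as "a, b". The helper name last_item is made up for illustration, and the sketch assumes every column holds strings.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "x1": ["banana", "grapes, Citrus", "tangerine, tangerine"],
        "x2": ["orange", "apples", "melons, pears"],
    }
)
df.index.name = "index"


def last_item(col: pd.Series) -> pd.Series:
    """Hypothetical helper: keep only the text after the last comma in each cell."""
    # rsplit with n=1 splits at most once, starting from the right;
    # cells without a comma come back as a one-element list.
    pieces = col.str.rsplit(",", n=1)
    # Take the last piece and drop any surrounding whitespace.
    return pieces.str[-1].str.strip()


# apply() without axis passes each column to the helper in turn.
result = df.apply(last_item)
print(result)
```

If the frame also has non-string columns, one option is to apply the helper only to the string columns, for example df[['x1', 'x2']].apply(last_item), since the .str accessor raises on numeric data.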
User contributions licensed under: CC BY-SA