Remove a substring from an array-of-strings column in a DataFrame

I have a DataFrame column in which each value is an array of strings:

````
| fruits                         |
|--------------------------------|
|['fruit=apple', 'fruit=banana'] |
|['fruit=orange', 'fruit=banana']|
|['fruit=apple', 'fruit=orange'] |
|['fruit=orange', 'fruit=orange']|
````

I want to get a result like this:

````
| fruits             |
|--------------------|
|['apple', 'banana'] |
|['orange', 'banana']|
|['apple', 'orange'] |
|['orange', 'orange']|
````

In other words, I want to remove the substring 'fruit=' from every element.

Answer

You can apply a function to each value in the column. In this case, split each string on '=' and take the last element of the result, which is the text after the final '='.

df['fruits'] = df['fruits'].apply(lambda x: [f.split('=')[-1] for f in x])
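
Splitting on '=' keeps only the text after the last '=', so a value such as 'fruit=a=b' would lose part of its content. If that could matter for your data, a prefix-only variant may be safer. The sketch below is one possible way to write it in plain Python; the helper name `strip_prefix` and the sample `rows` are illustrative assumptions, not part of the original answer.

```python
def strip_prefix(items, prefix="fruit="):
    """Return a new list with `prefix` removed from the start of each string.

    Hypothetical helper for illustration: strings that do not start with
    `prefix` are returned unchanged, and any later '=' is left alone.
    """
    return [s[len(prefix):] if s.startswith(prefix) else s for s in items]


# Sample rows mirroring the first two rows of the question's column.
rows = [
    ["fruit=apple", "fruit=banana"],
    ["fruit=orange", "fruit=banana"],
]

cleaned = [strip_prefix(row) for row in rows]
print(cleaned)  # [['apple', 'banana'], ['orange', 'banana']]
```

With the DataFrame from the question, the same helper could be passed to apply, for example df['fruits'] = df['fruits'].apply(strip_prefix).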

User contributions licensed under: CC BY-SA