
Assign a string from a list according to the beginning of the strings in a pandas dataframe column

Let’s take an example.

I have a list of known categories:

JavaScript

No string in that list is a substring of another string in the list.

And a dataframe:

JavaScript

I would like to add a column Category to this dataframe. If the string in the column Items starts with one of the strings in L_known_categories, regardless of case, the category is that string. If no match is found, the category is the string in the column Items.

I could use a for loop, but it is not efficient on my real, much larger dataframe. How could I do this?

Expected output:

JavaScript


Answer

You can use a regex with pandas.Series.str.extract:

JavaScript
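A minimal sketch of this approach, assuming hypothetical sample data (the category names and items below are illustrative and not taken from the question). It builds one anchored, case-insensitive pattern from L_known_categories, extracts the matching prefix, and falls back to the Items value when nothing matches. Mapping the match back to the spelling used in the list is an assumption about what "the category is that string" means.

```python
import re

import pandas as pd

# Hypothetical data for illustration only
L_known_categories = ["Fruit", "Vegetable", "Dairy"]
df = pd.DataFrame(
    {"Items": ["fruit apple", "VEGETABLE carrot", "Bread", "dairy milk", "Unknown fruit"]}
)

# One alternation anchored at the start of the string; re.escape guards
# against regex metacharacters in the category names.
pattern = "^(" + "|".join(re.escape(c) for c in L_known_categories) + ")"

# expand=False returns a Series; rows with no match become NaN.
matched = df["Items"].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Map the matched text back to the spelling used in the list (assumption),
# then fall back to the original item where there was no match.
canonical = {c.lower(): c for c in L_known_categories}
df["Category"] = matched.str.lower().map(canonical).fillna(df["Items"])

print(df)
```

Because the pattern is anchored with `^`, an item such as "Unknown fruit" is not matched and keeps its own value as the category. The extraction is vectorized, so no explicit Python loop over rows is needed.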
User contributions licensed under: CC BY-SA