Find all occurrences of substring in a vector and save results to another column

I have a data frame that looks like this:

import pandas as pd

data = {'ID': ['DFSADFEFDSAE', 'FDSADFDSEFDSAFEFDSADFE', 'ESADFDSADFSADFSA']}
data = pd.DataFrame(data)

I want to find every 'E' in each string and save the indices in another column. I tried using re.finditer together with map to convert the resulting list to a string for each row, but had no luck. What would be a good approach?

Answer

A list comprehension over re.finditer collects the start index of every match in each string:

import re

# For each ID, collect the start index of every 'E'
data['new'] = [[m.start() for m in re.finditer('E', i)] for i in data['ID']]
print(data)

Output:

                       ID          new
0            DFSADFEFDSAE      [6, 11]
1  FDSADFDSEFDSAFEFDSADFE  [8, 14, 21]
2        ESADFDSADFSADFSA          [0]
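
Since the question asks about substrings in general and about saving the result as a string, here is a minimal sketch of the same idea wrapped in a helper. The names find_all and new_str are made up for illustration and are not part of the original answer. re.escape is used so that characters such as '.' or '+' in the substring are matched literally, and only non-overlapping occurrences are returned.

```python
import re

import pandas as pd


def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub in text."""
    # re.escape makes regex metacharacters in sub match literally.
    return [m.start() for m in re.finditer(re.escape(sub), text)]


data = pd.DataFrame({'ID': ['DFSADFEFDSAE', 'FDSADFDSEFDSAFEFDSADFE', 'ESADFDSADFSADFSA']})

# One list of indices per row (hypothetical helper defined above).
data['new'] = data['ID'].apply(lambda s: find_all(s, 'E'))

# Optional: the same indices joined into a comma-separated string.
data['new_str'] = data['new'].apply(lambda idx: ','.join(map(str, idx)))

print(data)
```

Keeping the list column is usually more convenient for further processing. The string column is only useful if the indices have to be written out as text, for example to a CSV file.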
User contributions licensed under: CC BY-SA