I converted a column of a pandas DataFrame to a list:
subsectors = df['subsectors'].tolist()
I wanted to split strings like ‘BuyMeADrink’ into ‘Buy Me A Drink’,
so I tried each of the following:
[' '.join(re.findall('[A-Z][^A-Z]*', s)) for s in subsectors]
or
li = re.compile(r'(?<=[a-z])(?=[A-Z])')
strings = [li.sub(' ', subsectors) for string in subsectors]
or
output = []
for i in subsectors:
    output.append(" ".join(re.findall('[A-Z][^A-Z]*', i)))
All of the above returned this:
TypeError: expected string or bytes-like object
I understand that findall() needs a string, not a list, but here I am iterating over a list whose items should be strings. Why do I get this error?
Thank you.
Answer
Let’s try str.replace:
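import pandas as pd

df = pd.DataFrame({'subsectors': ['BuyMeADrink']})
# Put a space before every capitalised word, then strip the leading space.
# regex=True is required in pandas 2.0+, where the default is a literal match.
df['subsectors'].str.replace(r'([A-Z][a-z]*)', r' \1', regex=True).str.strip()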
Output:
0    Buy Me A Drink
Name: subsectors, dtype: object
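As for the TypeError in the question: the second snippet passes the whole list `subsectors` to `li.sub` instead of `string`, which raises this error regardless of the data. The other two snippets look fine, so a likely cause (an assumption, since the actual data is not shown) is that the column contains non-string entries such as NaN for missing values. The sketch below uses made-up data to reproduce the error and shows two ways around it; the `.str` accessor used above simply passes missing values through.

```python
import re
import pandas as pd

# Hypothetical data: the middle entry is a missing value (a float NaN).
df = pd.DataFrame({'subsectors': ['BuyMeADrink', float('nan'), 'RealEstate']})
subsectors = df['subsectors'].tolist()

# re.findall rejects the non-string entry with a TypeError.
try:
    re.findall('[A-Z][^A-Z]*', subsectors[1])
    raised = False
except TypeError:
    raised = True

# Option 1: only split real strings, leave everything else untouched.
split_list = [
    ' '.join(re.findall('[A-Z][^A-Z]*', s)) if isinstance(s, str) else s
    for s in subsectors
]

# Option 2: stay in pandas; .str methods propagate missing values.
split_series = (
    df['subsectors']
    .str.replace(r'([A-Z][a-z]*)', r' \1', regex=True)
    .str.strip()
)
```

Checking `df['subsectors'].map(type).value_counts()` on the real data would confirm whether non-string values are present.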
Keep in mind that your problem is inherently ambiguous: for example, how should 'ElectionInTheUSA' be split? The pattern above would give 'Election In The U S A'.
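One possible convention, shown purely as an illustration and not something the question specifies, is to keep a run of capitals together as an acronym and only split where a capitalised word starts. The helper name `split_camel` and the sample strings are hypothetical.

```python
import re

# Split between a lowercase letter and a capital, or between a capital
# and a capital that is followed by a lowercase letter (start of a word).
_BOUNDARY = re.compile(r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])')

def split_camel(text):
    """Insert spaces at word boundaries, keeping acronyms in one piece."""
    return _BOUNDARY.sub(' ', text)

results = [split_camel(s) for s in ['BuyMeADrink', 'ElectionInTheUSA', 'USAElection']]
# results == ['Buy Me A Drink', 'Election In The USA', 'USA Election']
```

This rule still guesses wrong when a one-letter word sits next to an acronym: 'BuyAUSAFlag' becomes 'Buy AUSA Flag'. No regex can resolve that without knowing the vocabulary.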