
How can I split a cell in a pandas dataframe and keep the delimiter in another column?

persons
John New York
Janet New York
Mike Denver
Michelle Texas

I want to split this into two columns, name and city. I tried this:

import pandas as pd

df = pd.DataFrame({"persons": ["John New York", "Janet New York", "Mike Denver", "Michelle Texas"]})
df[["name", "city"]] = df.persons.str.split("New York", expand=True)

and it gives me this:

          persons            name  city
0   John New York           John
1  Janet New York          Janet
2     Mike Denver     Mike Denver  None
3  Michelle Texas  Michelle Texas  None

What I want is to split by cities and keep the separator in the city column like this:

          persons            name      city
0   John New York           John   New York
1  Janet New York          Janet   New York
2     Mike Denver     Mike Denver      None
3  Michelle Texas  Michelle Texas      None


Answer

You can use a regex with a capture group:

df[['name', 'city']] = df['persons'].str.split(r'(New York)', expand=True).iloc[:,:2]

print(df)

          persons            name      city
0   John New York           John   New York
1  Janet New York          Janet   New York
2     Mike Denver     Mike Denver      None
3  Michelle Texas  Michelle Texas      None

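This works because str.split treats the pattern as a regular expression and, like Python's re.split, returns the text matched by a capture group as part of the result. Each matching row therefore splits into three pieces (the name, "New York", and an empty trailing string), and .iloc[:, :2] keeps only the first two.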
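
To illustrate the capture-group behaviour, here is a minimal sketch that first shows it with plain re.split and then applies the same pattern through pandas. The intermediate names (parts_without_group, parts_with_group, split) are only for illustration, and the .str.strip() call is an optional addition, not part of the answer above; it removes the trailing space that the split otherwise leaves on the name.

```python
import re

import pandas as pd

# Without a capture group the separator is discarded;
# with one, the matched separator is kept in the result list.
parts_without_group = re.split(r"New York", "John New York")
parts_with_group = re.split(r"(New York)", "John New York")

df = pd.DataFrame(
    {"persons": ["John New York", "Janet New York", "Mike Denver", "Michelle Texas"]}
)

# Same pattern through pandas: column 0 holds the text before the match,
# column 1 the captured separator (missing where nothing matched).
split = df["persons"].str.split(r"(New York)", expand=True)

# Optional clean-up (an addition for illustration): strip the trailing
# space left on the name by the split.
df["name"] = split[0].str.strip()
df["city"] = split[1]

print(parts_without_group)
print(parts_with_group)
print(df)
```

Rows that do not contain "New York" are not split at all, so their city value is missing and the full original string stays in the name column.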

User contributions licensed under: CC BY-SA