persons |
---|
John New York |
Janet New York |
Mike Denver |
Michelle Texas |
I want to split this into two columns, name and city. I tried this:

    import pandas as pd

    df = pd.DataFrame({"persons": ["John New York", "Janet New York", "Mike Denver", "Michelle Texas"]})
    df[["name", "city"]] = df.persons.str.split("New York", expand=True)

and it gives me this:

              persons            name  city
    0   John New York           John
    1  Janet New York          Janet
    2     Mike Denver     Mike Denver  None
    3  Michelle Texas  Michelle Texas  None

What I want is to split by cities and keep the separator in the city column like this:

              persons            name      city
    0   John New York            John  New York
    1  Janet New York           Janet  New York
    2     Mike Denver     Mike Denver      None
    3  Michelle Texas  Michelle Texas      None

Answer
You can use regex with a capture group:

    df[['name', 'city']] = df['persons'].str.split(r'(New York)', expand=True).iloc[:, :2]

    print(df)

              persons            name      city
    0   John New York           John   New York
    1  Janet New York          Janet   New York
    2     Mike Denver     Mike Denver      None
    3  Michelle Texas  Michelle Texas      None

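If the goal is to split on several cities rather than only New York, the same capture-group idea can probably be extended with an alternation. The following is only a sketch: the `cities` list is a hypothetical set of known city names, not something given in the question, and the extra `str.strip()` is my own addition to drop the space left in front of the city.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {"persons": ["John New York", "Janet New York", "Mike Denver", "Michelle Texas"]}
)

# Hypothetical list of known cities; an assumption, not from the question.
cities = ["New York", "Denver", "Texas"]

# One capture group with an alternative per city; re.escape makes each
# city name match literally.
pattern = "(" + "|".join(re.escape(city) for city in cities) + ")"

parts = df["persons"].str.split(pattern, expand=True)
df["name"] = parts[0].str.strip()  # text before the city, without the trailing space
df["city"] = parts[1]              # the captured separator, i.e. the city itself

print(df)
```

A row that contains none of the listed cities would keep its full text in `name` and get a missing value in `city`.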
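This works because a pattern containing a capture group makes the split keep the matched separator as its own element, the same rule Python's `re.split` follows.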
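To see the effect of the capture group in isolation, here is a minimal sketch using only Python's standard `re` module on a single string. It assumes pandas applies the same splitting rule to each row when the pattern is treated as a regex, which matches the output shown in the answer.

```python
import re

text = "John New York"

# Without a capture group the separator is discarded.
plain = re.split(r"New York", text)

# With a capture group the matched separator is kept as its own element.
captured = re.split(r"(New York)", text)

print(plain)     # ['John ', '']
print(captured)  # ['John ', 'New York', '']
```

The trailing empty string in the second result is why the answer keeps only the first two columns with `.iloc[:, :2]`.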