I am trying to split the following column using pandas (the DataFrame is named count):
    Location                            count
    POINT (-118.05425 34.1341)          355
    POINT (-118.244512 34.072581)       337
    POINT (-118.265586 34.043271)       284
    POINT (-118.360102 34.071338)       269
    POINT (-118.40816 33.943626)        241
to this desired outcome:
    X-Axis         Y-Axis       count
    -118.05425     34.1341      355
    -118.244512    34.072581    337
    -118.265586    34.043271    284
    -118.360102    34.071338    269
    -118.40816     33.943626    241
I have tried removing the word 'POINT' and both parentheses, but I am then left with extra whitespace at the beginning of the column. I tried using:
    count.columns = count.columns.str.lstrip()
But it did not remove the whitespace.
I was hoping to use this code to split the column:
    count = pd.DataFrame(count.Location.str.split(' ', n=1).tolist(),
                         columns=['x-axis', 'y-axis'])
The space between the x and y values could serve as the separator, but the leading whitespace gets in the way.
Answer
You can use .str.extract with a regex pattern containing capture groups:
    df[['x-axis', 'y-axis']] = df.pop('Location').str.extract(r'\((\S+) (\S+)\)')
    print(df)

       count       x-axis     y-axis
    0    355   -118.05425    34.1341
    1    337  -118.244512  34.072581
    2    284  -118.265586  34.043271
    3    269  -118.360102  34.071338
    4    241   -118.40816  33.943626
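For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same .str.extract approach. The conversion to float at the end is an extra step assumed here for convenience, not part of the answer above; .str.extract returns strings, so it may be useful if the coordinates are needed as numbers.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Location": [
        "POINT (-118.05425 34.1341)",
        "POINT (-118.244512 34.072581)",
        "POINT (-118.265586 34.043271)",
        "POINT (-118.360102 34.071338)",
        "POINT (-118.40816 33.943626)",
    ],
    "count": [355, 337, 284, 269, 241],
})

# \( and \) match the literal parentheses; each (\S+) captures one
# run of non-whitespace characters, i.e. one coordinate
coords = df.pop("Location").str.extract(r"\((\S+) (\S+)\)")

# Assumption (not in the original answer): cast the extracted strings to floats
df[["x-axis", "y-axis"]] = coords.astype(float)

print(df)
```

As a side note on the attempt in the question: count.columns.str.lstrip() trims the column labels, not the values inside the column, which is likely why the leading whitespace stayed. Stripping the values themselves would be count['Location'].str.lstrip().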