I am trying to split the following column using Pandas: (df name is count)
Location                         count
POINT (-118.05425 34.1341)       355
POINT (-118.244512 34.072581)    337
POINT (-118.265586 34.043271)    284
POINT (-118.360102 34.071338)    269
POINT (-118.40816 33.943626)     241
to this desired outcome:
X-Axis         Y-Axis       count
-118.05425     34.1341      355
-118.244512    34.072581    337
-118.265586    34.043271    284
-118.360102    34.071338    269
-118.40816     33.943626    241
I have tried removing the word ‘POINT’ and both brackets, but that leaves an extra white space at the beginning of each value in the column. I tried using:
count.columns = count.columns.str.lstrip()
But it was not removing the white space.
I was hoping to use this code to split the column:
count = pd.DataFrame(count.Location.str.split(' ',1).tolist(), columns = ['x-axis','y-axis'])
The space between the x and y values could be used as the separator, but the leading white space gets in the way.
Answer
You can use .str.extract with a regex pattern that has two capture groups, one for each coordinate:
df[['x-axis', 'y-axis']] = df.pop('Location').str.extract(r'\((\S+) (\S+)\)')
print(df)

   count       x-axis     y-axis
0    355   -118.05425    34.1341
1    337  -118.244512  34.072581
2    284  -118.265586  34.043271
3    269  -118.360102  34.071338
4    241   -118.40816  33.943626
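
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question under the name df (a stand-in for the asker's count frame). The float conversion and the column reordering are extra steps assumed here for convenience; they are not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question; `df` stands in for the asker's `count` frame.
df = pd.DataFrame({
    "Location": [
        "POINT (-118.05425 34.1341)",
        "POINT (-118.244512 34.072581)",
        "POINT (-118.265586 34.043271)",
        "POINT (-118.360102 34.071338)",
        "POINT (-118.40816 33.943626)",
    ],
    "count": [355, 337, 284, 269, 241],
})

# \( and \) match the literal parentheses; each (\S+) captures one run of
# non-whitespace characters, so the word POINT and the brackets are skipped.
coords = df.pop("Location").str.extract(r"\((\S+) (\S+)\)")

# extract returns strings; converting to float is an added step (assumption:
# numeric coordinates are wanted downstream).
df[["x-axis", "y-axis"]] = coords.astype(float)

# Reorder the columns to match the layout asked for in the question.
df = df[["x-axis", "y-axis", "count"]]
print(df)
```

As a side note on the attempt in the question: count.columns.str.lstrip() strips the column labels, not the cell values, which is why the white space stayed. Stripping the values themselves would be count['Location'].str.strip(), though with the regex above no stripping is needed.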