Skip to content
Advertisement

Splitting a Column using Pandas

I am trying to split the following column using Pandas: (df name is count)

Location                        count
POINT (-118.05425 34.1341)      355
POINT (-118.244512 34.072581)   337
POINT (-118.265586 34.043271)   284
POINT (-118.360102 34.071338)   269
POINT (-118.40816 33.943626)    241

to this desired outcome:

X-Axis        Y-Axis        count
-118.05425    34.1341       355
-118.244512   34.072581     337
-118.265586   34.043271     284
-118.360102   34.071338     269
-118.40816    33.943626     241

I have tried removing the word ‘POINT’, and both the brackets. But then I am met with an extra white space at the beginning of the column. I tried using:

count.columns = count.columns.str.lstrip()

But it was not removing the white space.

I was hoping to use this code to split the column:

count = pd.DataFrame(count.Location.str.split(' ',1).tolist(), 
                         columns = ['x-axis','y-axis'])

Since the space between both x and y axis could be used as the separator, but the white space.

Advertisement

Answer

You can use .str.extract with regex pattern having capture groups:

df[['x-axis', 'y-axis']] = df.pop('Location').str.extract(r'((S+) (S+))')

print(df)
   count       x-axis     y-axis
0    355   -118.05425    34.1341
1    337  -118.244512  34.072581
2    284  -118.265586  34.043271
3    269  -118.360102  34.071338
4    241   -118.40816  33.943626
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement