How to find the distance in meters between two points in a dataframe?

I have a dataframe in which the columns with subscript 1 are starting points and those with subscript 2 are end points. I want to find the distance in kilometers between them. I tried the following code but got an error:

import mpu
import pandas as pd
import numpy as np

data = {'lat1': [39.92123,  39.93883,  39.93883,  39.91034,  39.91248],
        'lon1': [116.51172, 116.51135, 116.51135, 116.51627, 116.47186],
        'lat2': [np.nan,    39.92123,  39.93883,  39.93883,  39.91034],
        'lon2': [np.nan,   116.51172, 116.51135, 116.51135, 116.51627]}

# Create DataFrame
df = pd.DataFrame(data)

df['distance'] = mpu.haversine_distance((df.lat1, df.lon1), (df.lat2, df.lon2))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer

Use .apply() with a lambda function so that the coordinates are passed as scalar values, one row at a time, instead of passing four whole pandas Series to the function:

df['distance'] = df.apply(lambda x: mpu.haversine_distance((x.lat1, x.lon1), (x.lat2, x.lon2)), axis=1)
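
One thing to watch: the first row has NaN in lat2/lon2. I have not checked how every mpu version treats missing coordinates, so if that row raises an error instead of giving NaN, a small guard may help. Below is a minimal sketch under that assumption. nan_safe and haversine_km are hypothetical helpers written for illustration, not part of mpu; haversine_km is only a plain-Python stand-in so the snippet runs without mpu installed.

```python
import math

def haversine_km(origin, destination, radius_km=6371.0):
    """Stand-in haversine: great-circle distance in km between (lat, lon) pairs in degrees."""
    lat1, lon1 = origin
    lat2, lon2 = destination
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def nan_safe(dist_fn):
    """Wrap a scalar distance function so rows with a missing coordinate yield NaN."""
    def wrapper(origin, destination):
        if any(math.isnan(v) for v in (*origin, *destination)):
            return math.nan
        return dist_fn(origin, destination)
    return wrapper

safe_distance = nan_safe(haversine_km)

# Same coordinates as in the question, as plain (lat1, lon1, lat2, lon2) tuples.
rows = [
    (39.92123, 116.51172, math.nan, math.nan),
    (39.93883, 116.51135, 39.92123, 116.51172),
    (39.93883, 116.51135, 39.93883, 116.51135),
    (39.91034, 116.51627, 39.93883, 116.51135),
    (39.91248, 116.47186, 39.91034, 116.51627),
]
distances = [safe_distance((a, b), (c, d)) for a, b, c, d in rows]
```

With the dataframe from the question, the same wrapper could be put around the real function, e.g. safe = nan_safe(mpu.haversine_distance), and safe(...) called inside the lambda.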

You can also use list(map(...)) for faster execution, as follows:

df['distance'] = list(map(mpu.haversine_distance, zip(df.lat1, df.lon1), zip(df.lat2, df.lon2)))
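
If the dataframe is large, another option is to skip mpu and compute the haversine formula directly on whole columns with NumPy. This is a different technique from the two above, shown only as a rough sketch: haversine_km is a hypothetical helper, and the 6371 km mean Earth radius is an assumption (as far as I know mpu uses the same value, but I have not verified that for every version). NaN coordinates simply propagate to NaN distances here.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Vectorized haversine distance in km; inputs are degrees (arrays or Series)."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lon1, lat2, lon2)
    )
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_km * np.arcsin(np.sqrt(a))

# Same coordinates as in the question, as plain arrays.
lat1 = np.array([39.92123, 39.93883, 39.93883, 39.91034, 39.91248])
lon1 = np.array([116.51172, 116.51135, 116.51135, 116.51627, 116.47186])
lat2 = np.array([np.nan, 39.92123, 39.93883, 39.93883, 39.91034])
lon2 = np.array([np.nan, 116.51172, 116.51135, 116.51135, 116.51627])

distance_km = haversine_km(lat1, lon1, lat2, lon2)
distance_m = distance_km * 1000  # the title asks for meters
```

Since np.asarray also accepts pandas Series, the columns can be passed in directly: df['distance'] = haversine_km(df.lat1, df.lon1, df.lat2, df.lon2).
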
User contributions licensed under: CC BY-SA