Distance Matrix between rows of a Pandas Dataframe with Lat and Lon

I have a Pandas DataFrame with the coordinates of different cell towers where one column is the Latitude and another column is the Longitude like this:

    Tower_Id   Latitude   Longitude
0   a1         x1         y1
1   a2         x2         y2
2   a3         x3         y3

and so on

I need to get the distances between each cell tower and all the others, and subsequently between each cell tower and its closest neighbouring tower.

I have been trying to reuse some code that computes the distance between a tower's location and its expected location, which I got from interpolation (in that case I had four columns: two for the actual coordinates and two for the expected ones). The code I used is the following:

import math

def haversine(row):
    # coordinates of the tower and of its expected location, in degrees
    lon1 = row['Lon']
    lat1 = row['Lat']
    lon2 = row['Expected_Lon']
    lat2 = row['Expected_Lat']
    # convert degrees to radians
    lon1, lat1, lon2, lat2 = map(math.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    # haversine formula
    a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
    c = 2 * math.asin(math.sqrt(a))
    # 6367 is the Earth radius in km
    km = 6367 * c
    return km

I have not been able to adapt this to compute the distance matrix for the cell towers in my current DataFrame. Can anybody help me with this?

Answer

Scipy’s distance_matrix essentially uses broadcasting, so here’s a solution that applies the same idea to the haversine formula:

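import numpy as np
import pandas as pd

# toy data (note: real latitudes lie in [-90, 90] and longitudes in [-180, 180];
# the ranges here are kept as originally posted so the printed values still match)
lendf = 4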
np.random.seed(1)
lats = np.random.uniform(0,180, lendf)
np.random.seed(2)
lons = np.random.uniform(0,360, lendf)
df = pd.DataFrame({'Tower_Id': range(lendf),
                   'Lat': lats,
                   'Lon': lons})
df.head()
#   Tower_Id    Lat         Lon
#0  0           75.063961   156.958165
#1  1           129.658409  9.333443
#2  2           0.020587    197.878492
#3  3           54.419863   156.716061

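# x contains lat-lon values, converted from degrees to radians
x = df[['Lat','Lon']].values * (np.pi/180.0)

# squared sine of half the pairwise differences, shape (n, n, 2)
sine_diff = np.sin((x - x[:,None,:])/2)**2

# cosine of lat
lat_cos = np.cos(x[:,0])

# haversine term for every pair of rows
a = sine_diff[:,:,0] + lat_cos * lat_cos[:, None] * sine_diff[:,:,1]
# distances in km (6373 is the Earth radius in km used here)
c = 2 * 6373 * np.arcsin(np.sqrt(a))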

Output (c):

array([[   0.        , 3116.76244275, 8759.2773379 , 2296.26375266],
       [3116.76244275,    0.        , 5655.63934703, 2239.2455718 ],
       [8759.2773379 , 5655.63934703,    0.        , 7119.00606308],
       [2296.26375266, 2239.2455718 , 7119.00606308,    0.        ]])
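
The question also asks for the distance from each tower to its closest neighbour. One way to get that from a distance matrix is to mask the diagonal (each tower's zero distance to itself) and take the row-wise minimum. The following is only a minimal sketch: the helper name haversine_matrix, the towers DataFrame, its coordinates, and the Nearest_Id and Nearest_km column names are all made up for illustration, and the 6373 km radius is simply the value used above.

```python
import numpy as np
import pandas as pd


def haversine_matrix(lat_deg, lon_deg, radius_km=6373.0):
    """Pairwise great-circle distances in km, computed with broadcasting.

    Hypothetical helper, not part of any library.
    """
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    # (n, 1) - (1, n) broadcasts to an (n, n) grid of pairwise differences
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # clip guards against tiny floating-point overshoots outside [0, 1]
    return 2 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# made-up towers on the equator, purely for illustration
towers = pd.DataFrame({
    'Tower_Id': ['a1', 'a2', 'a3'],
    'Lat': [0.0, 0.0, 0.0],
    'Lon': [0.0, 1.0, 3.0],
})

dist = haversine_matrix(towers['Lat'], towers['Lon'])

# a tower must not be its own nearest neighbour, so hide the diagonal
masked = dist.copy()
np.fill_diagonal(masked, np.inf)

# position of the closest other tower in each row, and the distance to it
towers['Nearest_Id'] = towers['Tower_Id'].to_numpy()[masked.argmin(axis=1)]
towers['Nearest_km'] = masked.min(axis=1)
```

Note that the full matrix needs memory proportional to the square of the number of towers. For a large number of towers, a spatial index such as scikit-learn's BallTree with metric='haversine' may be a better fit.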
User contributions licensed under: CC BY-SA