I am working on a data frame that looks like this:
                 lat       lon
    id_zone
    0        40.0795  4.338600
    1        45.9990  4.829600
    2        45.2729  2.882000
    3        45.7336  4.850478
    4        45.6981  5.043200
I’m trying to build a Haversine distance matrix. For each zone, I would like to calculate the distance between it and every other zone in the dataframe, so the diagonal should contain only 0s. Here is the Haversine function that I use, but I can’t get from it to the matrix.
    def haversine(x):
        x.lon, x.lat, x.lon2, x.lat2 = map(radians, [x.lon, x.lat, x.lon2, x.lat2])
        # Haversine formula
        dlon = x.lon2 - x.lon
        dlat = x.lat2 - x.lat
        a = sin(dlat / 2) ** 2 + cos(x.lat) * cos(x.lat2) * sin(dlon / 2) ** 2
        c = 2 * atan2(sqrt(a), sqrt(1 - a))
        km = 6367 * c
        return km
Answer
You can adapt the solution from the answer to "Pandas – Creating Difference Matrix from Data Frame".
Or in your specific case, where you have a DataFrame like this example:
                 lat       lon
    id_zone
    0        40.0795  4.338600
    1        45.9990  4.829600
    2        45.2729  2.882000
    3        45.7336  4.850478
    4        45.6981  5.043200
And your function is defined as:
    import numpy as np

    def haversine(first, second):
        # first and second are (lat, lon) pairs in decimal degrees
        # convert decimal degrees to radians
        lat, lon, lat2, lon2 = map(np.radians, [first[0], first[1], second[0], second[1]])
        # haversine formula
        dlon = lon2 - lon
        dlat = lat2 - lat
        a = np.sin(dlat/2)**2 + np.cos(lat) * np.cos(lat2) * np.sin(dlon/2)**2
        c = 2 * np.arcsin(np.sqrt(a))
        r = 6371  # Radius of earth in kilometers. Use 3956 for miles
        return c * r
Here, first and second each hold the lat and lon of one location, in decimal degrees.
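For reference, the function above follows the standard haversine formula. The notation below is introduced here for illustration and does not appear in the original post: φ is latitude and λ is longitude (both in radians), r is the sphere radius, and d is the resulting great-circle distance.

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
   + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right) \\
d &= 2r \arcsin\!\left(\sqrt{a}\right)
   = 2r \operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right), \qquad 0 \le a \le 1
\end{aligned}
```

The arcsin form used in the answer and the atan2 form used in the question give the same angle for 0 ≤ a ≤ 1, so the two functions differ only in the radius they assume (6371 km versus 6367 km, a difference of well under 0.1%).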
You can then build the distance matrix with NumPy by starting from a matrix of zeros and filling each entry with the haversine distance between the corresponding pair of zones:
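    import pandas as pd

    # (lat, lon) pairs as a plain NumPy array, one row per zone;
    # this avoids positional indexing on a label-indexed Series
    coords = df[["lat", "lon"]].to_numpy()

    # create a matrix for the distances between each pair of zones
    distances = np.zeros((len(df), len(df)))
    for i in range(len(df)):
        for j in range(len(df)):
            distances[i, j] = haversine(coords[i], coords[j])

    pd.DataFrame(distances, index=df.index, columns=df.index)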
Your output should be similar to this:
    id_zone           0           1           2           3           4
    id_zone
    0          0.000000  659.422944  589.599339  630.083979  627.383858
    1        659.422944    0.000000  171.597296   29.555376   37.325316
    2        589.599339  171.597296    0.000000  161.731366  174.983855
    3        630.083979   29.555376  161.731366    0.000000   15.474533
    4        627.383858   37.325316  174.983855   15.474533    0.000000
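The nested loop above calls the haversine function once per pair, which can get slow as the number of zones grows. As a possible alternative, here is a minimal sketch that computes the whole matrix at once with NumPy broadcasting. The helper name haversine_matrix, its lat_col and lon_col parameters, and the EARTH_RADIUS_KM constant are made up for this example. The sample frame is rebuilt from the data shown in the question, assuming id_zone is the index.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371  # same radius as the haversine function in the answer


def haversine_matrix(df, lat_col="lat", lon_col="lon"):
    """Return a DataFrame of pairwise haversine distances in kilometers.

    Hypothetical helper for illustration; expects coordinates in decimal degrees.
    """
    # Convert to radians once, as 1-D arrays of length n.
    lat = np.radians(df[lat_col].to_numpy(dtype=float))
    lon = np.radians(df[lon_col].to_numpy(dtype=float))

    # A column (n, 1) minus a row (1, n) broadcasts to every pairwise
    # difference, giving (n, n) arrays.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]

    a = (
        np.sin(dlat / 2) ** 2
        + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2
    )
    # Clip guards against rounding pushing a slightly outside [0, 1].
    dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    return pd.DataFrame(dist, index=df.index, columns=df.index)


# Sample data from the question, with id_zone assumed to be the index.
df = pd.DataFrame(
    {
        "lat": [40.0795, 45.9990, 45.2729, 45.7336, 45.6981],
        "lon": [4.338600, 4.829600, 2.882000, 4.850478, 5.043200],
    },
    index=pd.Index(range(5), name="id_zone"),
)

matrix = haversine_matrix(df)
print(matrix.round(2))
```

This trades memory for speed, since it holds several n by n arrays at once. For a handful of zones the loop version is just as good and arguably easier to read.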