
Performing calculations on DataFrames of different lengths

I have two different DataFrames that look something like this:

Lat    Lon
28.13  -87.62
28.12  -87.65
...    ...

Calculated_Dist_m
34.5
101.7
...

The first DataFrame (df, with the Lat and Lon columns) has just over 1000 rows. The second DataFrame (new_calc_dist, with the Calculated_Dist_m column) has over 30000 rows. I want to compute new latitude and longitude coordinates from the Lat, Lon, and Calculated_Dist_m columns. Here is the code I’ve tried:

import numpy as np

r_earth = 6371000  # Earth radius in metres
new_lat = df['Lat'] + (new_calc_dist['Calculated_Dist_m'] / r_earth) * (180/np.pi)
new_lon = df['Lon'] + (new_calc_dist['Calculated_Dist_m'] / r_earth) * (180/np.pi) / np.cos(df['Lat'] * np.pi/180)

When I run the code, however, it returns new values only for some index values and NaN for the rest. I’m not sure how to write the code so that new latitude and longitude points are calculated for each of the 30000+ distances, based on each of the initial 1000 latitude and longitude points. Any suggestions?

EDIT

Here are some sample inputs and outputs. Note that these are not exact figures, but they give the idea.

Lat    Lon
28.13  -87.62
28.12  -87.65
28.12  -87.63
...    ...

Calculated_Dist_m
34.5
101.7
28.6
30.8
76.5
...

And so the sample output would be:

Lat     Lon
28.125  -87.625
28.15   -87.61
28.127  -87.623
28.128  -87.623
28.14   -87.615
28.115  -87.655
28.14   -87.64
28.117  -87.653
28.118  -87.653
28.15   -87.645
28.115  -87.635
28.14   -87.62
28.115  -87.613
28.117  -87.633
28.118  -87.633
...     ...

Again, these are just made-up outputs (I tried to compute the exact values but could not get it to work). Overall, though, they show what I want: take the coordinates from the first DataFrame and calculate new coordinates for each of the calculated distances from the second DataFrame.


Answer

If I understood correctly, and assuming df1 (the Lat/Lon DataFrame) and df2 (the Calculated_Dist_m DataFrame) as inputs, you can perform a cross merge to get every combination of df1 and df2 rows, then apply your computation (stored here in the new columns Lat2 and Lon2):

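import numpy as np

# Pair every row of df1 with every row of df2 (how='cross' needs pandas >= 1.2)
df = df1.merge(df2, how='cross')
r_earth = 6371000  # Earth radius in metres
df['Lat2'] = df['Lat'] + (df['Calculated_Dist_m'] / r_earth) * (180/np.pi)
df['Lon2'] = df['Lon'] + (df['Calculated_Dist_m'] / r_earth) * (180/np.pi) / np.cos(df['Lat'] * np.pi/180)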

output:

      Lat    Lon  Calculated_Dist_m       Lat2       Lon2
0   28.13 -87.62               34.5  28.130310 -87.619648
1   28.13 -87.62              101.7  28.130915 -87.618963
2   28.13 -87.62               28.6  28.130257 -87.619708
3   28.13 -87.62               30.8  28.130277 -87.619686
4   28.13 -87.62               76.5  28.130688 -87.619220
5   28.12 -87.65               34.5  28.120310 -87.649648
6   28.12 -87.65              101.7  28.120915 -87.648963
7   28.12 -87.65               28.6  28.120257 -87.649708
8   28.12 -87.65               30.8  28.120277 -87.649686
9   28.12 -87.65               76.5  28.120688 -87.649220
10  28.12 -87.63               34.5  28.120310 -87.629648
11  28.12 -87.63              101.7  28.120915 -87.628963
12  28.12 -87.63               28.6  28.120257 -87.629708
13  28.12 -87.63               30.8  28.120277 -87.629686
14  28.12 -87.63               76.5  28.120688 -87.629220
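
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch using the sample values from the question's EDIT. It also shows the likely cause of the NaNs in the original attempt: arithmetic between two pandas Series aligns on index labels, so labels that exist in only one Series produce NaN. The names R_EARTH_M, pairs, aligned, nan_count and offset_deg are illustrative choices, not part of the original code, and np.degrees/np.radians are used as equivalents of the manual 180/np.pi conversions.

```python
import numpy as np
import pandas as pd

R_EARTH_M = 6371000  # Earth radius in metres, same value as in the question

# Small stand-ins for the two DataFrames (sample values from the EDIT).
df1 = pd.DataFrame({"Lat": [28.13, 28.12, 28.12],
                    "Lon": [-87.62, -87.65, -87.63]})
df2 = pd.DataFrame({"Calculated_Dist_m": [34.5, 101.7, 28.6, 30.8, 76.5]})

# Series arithmetic aligns on index labels: labels 3 and 4 exist only in
# df2, so the sum is NaN there instead of being computed.
aligned = df1["Lat"] + df2["Calculated_Dist_m"]
nan_count = int(aligned.isna().sum())

# Cross merge: every row of df1 paired with every row of df2.
# how="cross" is available in pandas 1.2 and later.
pairs = df1.merge(df2, how="cross")

# Angular offset in degrees corresponding to each distance.
offset_deg = np.degrees(pairs["Calculated_Dist_m"] / R_EARTH_M)
pairs["Lat2"] = pairs["Lat"] + offset_deg
pairs["Lon2"] = pairs["Lon"] + offset_deg / np.cos(np.radians(pairs["Lat"]))

print(nan_count)    # number of unmatched index labels
print(pairs.shape)  # (rows of df1 * rows of df2, 5 columns)
```

With the sizes mentioned in the question (just over 1000 rows and over 30000 rows), the cross merge produces upward of 30 million rows, so it may be worth checking memory use before running it on the full data.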

User contributions licensed under: CC BY-SA