
Doing elementary analytic geometry in pandas

We have two points in Cartesian space that serve as origins, plus some other points. For those other points we are only interested in their distances from the two origins, i.e. two numbers per point. So we want to get from this DataFrame

d = {'origin1': [(1,0,0), (0,0,0)],
     'origin2': [(2,0,0), (0,0,1)],
     'point1': [(40,0,0), (0,0,20)],
     'point2': [(50,0,0), (0,0,25)],
     'point3': [(60,0,0), (0,0,30)]}

display(pd.DataFrame(data=d, index=[0, 1]))

to this DataFrame

d = {'origin1': [(1,0,0), (0,0,0)],
     'origin2': [(2,0,0), (0,0,1)],
     'point1': [(39,38), (20,19)],
     'point2': [(49,48), (25,24)],
     'point3': [(59,58), (30,29)]}

display(pd.DataFrame(data=d, index=[0, 1]))

Of course, we chose simple numbers here so that the distances are easy to check by eye. In the general case we should use the Pythagorean (Euclidean) distance formula.
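For reference, the Euclidean distance between two points is the square root of the sum of squared coordinate differences. The following is a minimal sketch of that formula checked against the first row of the example data. The helper name `euclidean` is only illustrative, and `math.dist` assumes Python 3.8 or newer.

```python
import math

def euclidean(p, q):
    # Square root of the sum of squared coordinate differences.
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

origin1, origin2, point1 = (1, 0, 0), (2, 0, 0), (40, 0, 0)

print(euclidean(origin1, point1))  # 39.0
print(euclidean(origin2, point1))  # 38.0

# The standard library gives the same result (Python 3.8+).
print(math.dist(origin1, point1))  # 39.0
```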


Answer

Here is a solution that works for points of arbitrary dimension and for any number of origins:

import pandas as pd

d = {'origin1': [(1,0,0) , (0,0,0 )],
     'origin2': [(2,0,0) , (0,0,1 )],
     'point1' : [(40,0,0), (0,0,20)],
     'point2' : [(50,0,0), (0,0,25)],
     'point3' : [(60,0,0), (0,0,30)]}

df_pnt = pd.DataFrame(data=d, index=[0, 1])
df_pnt

Next we define two helper functions:

def distance(p1, p2):
    '''
    Calculate the distance between two given points
    Arguments:
        p1, p2: points
    Returns:
        Distance of points
    '''
    dmn = min(len(p1), len(p2))
    vct = ((p1[i] - p2[i])**2 for i in range(dmn))
    smm = sum(vct)
    dst = smm**.5
    return dst

def distances(df, n):
    '''
    Calculate the distances between points & origins
    Arguments:
        df: dataframe of points including origins
        n : number of origins
    Returns:
        dataframe of distances
    '''
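    # Copy the origin columns so that adding columns below does not
    # write into a slice of df (avoids SettingWithCopyWarning)
    df_dst = df.iloc[:, :n].copy()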
    for column in df.columns[n:]:
        df_dst[column] = df.apply(lambda row: tuple(distance(row[origin], row[column]) for origin in df.columns[:n]), axis=1)
    return df_dst

Calling it with the number of origin columns gives the desired output (the distances come back as floats, e.g. (39.0, 38.0)):

distances(df_pnt, 2)
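As a cross-check, the same table can be built by naming the origin columns explicitly instead of relying on their position. This is only a sketch under some assumptions: the `origins` list, the `result` name, and the use of `math.dist` (Python 3.8+) are not part of the answer above. Unlike `distance`, `math.dist` raises an error when the two points have different dimensions rather than truncating to the shorter one.

```python
import math
import pandas as pd

d = {'origin1': [(1, 0, 0), (0, 0, 0)],
     'origin2': [(2, 0, 0), (0, 0, 1)],
     'point1': [(40, 0, 0), (0, 0, 20)],
     'point2': [(50, 0, 0), (0, 0, 25)],
     'point3': [(60, 0, 0), (0, 0, 30)]}
df_pnt = pd.DataFrame(data=d, index=[0, 1])

# Assumption: origin columns are identified by name, not by position.
origins = ['origin1', 'origin2']
points = [c for c in df_pnt.columns if c not in origins]

result = df_pnt[origins].copy()
for col in points:
    # For each row, one distance per origin, packed into a tuple.
    result[col] = [
        tuple(math.dist(o, p) for o in origin_row)
        for origin_row, p in zip(df_pnt[origins].itertuples(index=False),
                                 df_pnt[col])
    ]

print(result['point1'].tolist())  # [(39.0, 38.0), (20.0, 19.0)]
```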

I hope this is what you want.

User contributions licensed under: CC BY-SA