Python – trying to Calculate the distance between Starting post code and Destination post code for each entry of my data. Issue with Country [closed]

I have the following table:

Country  Starting Post Code  Destination post Code
US       99685               65039
GB       AB15                DD9

I am trying to run the following code to return the road miles between the starting and destination postcodes, but as I am still learning Python I am struggling to get it to pass in the Country from the table above. I can use the commented-out line to pass either 'GB' or 'US', but I need the country to be taken from the table for each row.

import pandas as pd
import pgeocode
df = pd.read_excel(r"C:\Users\APP DEVPython\distance\Road\Address.xlsx",sheet_name=0)
#dist = pgeocode.GeoDistance('GB')
dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
df["Distance"]=dist.query_postal_code(df['Starting Post Code'].astype(str).to_list(),df['Destination post Code'].astype(str).to_list())

print(df)

The issue is with the following line:

dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()

The error traceback I get is as follows:

> Traceback (most recent call last):
>   File "c:\Users\kyddgorg\Desktop\APP DEVPython\distance\Road\Distance.py", line 6, in <module>
>     dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
>   File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pgeocode.py", line 333, in __init__
>     super().__init__(country)
>   File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pgeocode.py", line 193, in __init__
>     country = country.upper()
>   File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py", line 5575, in __getattr__
>     return object.__getattribute__(self, name)
> AttributeError: 'Series' object has no attribute 'upper'

Thanks for any help

Answer

I took the liberty of changing the column labels so that their capitalisation is consistent; please make sure they match the headers in your data file.

import pgeocode
import pandas as pd


# Raw string so the backslashes in the Windows path are not treated as escapes
df = pd.read_excel(r"C:\Users\APP DEVPython\distance\Road\Address.xlsx", sheet_name=0)

## This is a dictionary that simulates the data
# data = {
#     "Country": ["US", "GB"],
#     "Starting post code": ["99685", "AB15"],
#     "Destination post code": ["65039", "DD9"],
# }
# df = pd.DataFrame.from_dict(data=data)

df["Distance"] = df.apply(
    lambda row: pgeocode.GeoDistance(row["Country"]).query_postal_code(
        row["Starting post code"], row["Destination post code"]
    ),
    axis=1,
)

print(df)
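
For context on why the original line failed: judging by the traceback, pgeocode calls country.upper() on whatever is passed to GeoDistance, so it seems to expect a single country code string, whereas df['Country'] is a whole Series. The small sketch below only uses pandas (the Series contents are made-up sample values) to show the difference between a str and a Series in that respect:

```python
import pandas as pd

# Made-up sample data standing in for df["Country"]
countries = pd.Series(["US", "GB"])

# A plain string has an .upper() method; a Series does not,
# which matches the AttributeError in the traceback.
print(hasattr("gb", "upper"))
print(hasattr(countries, "upper"))

# String methods on a Series live under the .str accessor ...
upper_codes = countries.str.upper().tolist()

# ... while selecting a single row's value gives back an ordinary str,
# which is what the per-row lambda relies on.
first_code = countries.iloc[0]
print(upper_codes, first_code)
```

This is why the lambda above passes row["Country"], a single value, instead of the whole column.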

I must warn you that using apply is quite inefficient and may scale badly if you have millions of rows.
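
If that becomes a problem, one possible alternative is to build a single GeoDistance object per country and query all of that country's rows in one call. The following is only a sketch under a few assumptions: the column labels are the ones used in this answer, add_distances and its make_calculator parameter are hypothetical names introduced here (the parameter exists so the helper can be tried without downloading pgeocode's data), and query_postal_code is assumed to accept two equal-length lists, as it does in the question's code. As far as I know, pgeocode reports great-circle distance in kilometres, so the result is not road miles.

```python
import pandas as pd


def add_distances(df, make_calculator=None):
    """Return a copy of df with a "Distance" column, using one calculator per country.

    make_calculator is a hypothetical hook: any callable that takes a country
    code and returns an object with query_postal_code(starts, ends).
    """
    if make_calculator is None:
        import pgeocode  # imported lazily; only needed for the real lookup

        make_calculator = pgeocode.GeoDistance

    out = df.copy()
    out["Distance"] = float("nan")

    # One calculator per distinct country instead of one per row
    for country, group in out.groupby("Country"):
        calc = make_calculator(country)
        distances = calc.query_postal_code(
            group["Starting post code"].astype(str).tolist(),
            group["Destination post code"].astype(str).tolist(),
        )
        # Write the results back to the rows belonging to this country
        out.loc[group.index, "Distance"] = distances

    return out
```

With the real library this would be called as df = add_distances(df). Whether it is actually faster on your data is something to measure; the main saving is that the country's lookup table is not set up again for every row.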

User contributions licensed under: CC BY-SA