I have the following table:
Country | Starting Post Code | Destination post Code |
---|---|---|
US | 99685 | 65039 |
GB | AB15 | DD9 |
I am trying to run the following query to return the road miles between the starting and destination post codes, but as I am still learning Python I am struggling to pass in the Country from the table above. I can use the commented-out line to hard-code either 'GB' or 'US', but I need the country to come from the table for each row.
```python
import pandas as pd
import pgeocode

df = pd.read_excel(r"C:\Users\APP DEVPython\distance\Road\Address.xlsx",sheet_name=0)
#dist = pgeocode.GeoDistance('GB')
dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
df["Distance"]=dist.query_postal_code(df['Starting Post Code'].astype(str).to_list(),df['Destination post Code'].astype(str).to_list())
print(df)
```
The issue is with the following line:
dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
The error traceback I get is as follows:
> Traceback (most recent call last):
>   File "c:\Users\kyddgorg\Desktop\APP DEVPython\distance\Road\Distance.py", line 6, in <module>
>     dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
>   File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pgeocode.py", line 333, in __init__
>     super().__init__(country)
>   File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pgeocode.py", line 193, in __init__
>     country = country.upper()
>   File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py", line 5575, in __getattr__
>     return object.__getattribute__(self, name)
> AttributeError: 'Series' object has no attribute 'upper'
Thanks for any help
Answer
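`pgeocode.GeoDistance` expects a single country code string, which is why passing the whole `df['Country']` Series fails at `country.upper()`. The code below instead creates the calculator row by row. I took the liberty of modifying the column labels to unify their names; please make sure they match your data file.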
```python
import pgeocode
import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes.
df = pd.read_excel(r"C:\Users\APP DEVPython\distance\Road\Address.xlsx", sheet_name=0)

## This is a dictionary that simulates the data
# data = {
#     "Country": ["US", "GB"],
#     "Starting post code": ["99685", "AB15"],
#     "Destination post code": ["65039", "DD9"],
# }
# df = pd.DataFrame.from_dict(data=data)

# Build a GeoDistance for each row's country and query that row's post codes.
# str() guards against Excel loading all-digit post codes as integers.
df["Distance"] = df.apply(
    lambda row: pgeocode.GeoDistance(row["Country"]).query_postal_code(
        str(row["Starting post code"]), str(row["Destination post code"])
    ),
    axis=1,
)

print(df)
```
I must warn you that using `apply` is quite inefficient, because it runs row by row and constructs a new `GeoDistance` object each time, so it may scale badly if you have millions of rows.
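One way to reduce that overhead is to group the rows by country, create one calculator per distinct country, and query each group as a batch. The following is only a minimal sketch under stated assumptions: `distances_by_country`, `make_calculator`, `in_miles`, and `KM_TO_MILES` are hypothetical names introduced here, and the sketch assumes that `query_postal_code` accepts two equal-length lists and returns one distance per pair, which is how the question's own code calls it.

```python
from collections import defaultdict

KM_TO_MILES = 0.621371  # 1 km is roughly 0.621371 statute miles


def distances_by_country(countries, starts, ends, make_calculator, in_miles=False):
    """Return one distance per row, creating one calculator per distinct country.

    make_calculator is any callable that takes a country code and returns an
    object with query_postal_code(list_of_starts, list_of_ends). The intent is
    to pass pgeocode.GeoDistance here (an assumption of this sketch).
    """
    countries, starts, ends = list(countries), list(starts), list(ends)

    # Collect the row positions that belong to each country.
    rows_by_country = defaultdict(list)
    for position, country in enumerate(countries):
        rows_by_country[str(country).strip().upper()].append(position)

    distances = [None] * len(countries)
    for country, positions in rows_by_country.items():
        calculator = make_calculator(country)  # built once per country
        # Cast to str: Excel may load all-digit post codes as integers.
        batch = calculator.query_postal_code(
            [str(starts[p]) for p in positions],
            [str(ends[p]) for p in positions],
        )
        # Write each result back to the row it came from.
        for position, km in zip(positions, batch):
            km = float(km)
            distances[position] = km * KM_TO_MILES if in_miles else km
    return distances
```

With the DataFrame from the answer, it could be used as `df["Distance"] = distances_by_country(df["Country"], df["Starting post code"], df["Destination post code"], pgeocode.GeoDistance, in_miles=True)`. Note that, according to its documentation, pgeocode returns a straight-line (great-circle) distance in kilometres, so even after converting to miles this is not a road distance; true road mileage would need a routing service.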