I have the following table:
Country | Starting Post Code | Destination post Code |
---|---|---|
US | 99685 | 65039 |
GB | AB15 | DD9 |
I am trying to run the following query to return the road miles between the starting and destination postcodes, but as I am still learning Python I am struggling to get it to pass in the Country from the table above. I can use the commented-out code to pass either ‘GB’ or ‘US’, but I need this value to come from the table.
    import pandas as pd
    import pgeocode
    df = pd.read_excel(r"C:\Users\APP DEVPython\distance\Road\Address.xlsx", sheet_name=0)
    #dist = pgeocode.GeoDistance('GB')
    dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
    df["Distance"] = dist.query_postal_code(df['Starting Post Code'].astype(str).to_list(), df['Destination post Code'].astype(str).to_list())

    print(df)
The issue is with the following line:
    dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
The error traceback I get is as follows:
    Traceback (most recent call last):
      File "c:\Users\kyddgorg\Desktop\APP DEVPython\distance\Road\Distance.py", line 6, in <module>
        dist = pgeocode.GeoDistance(df['Country']).astype(str).to_list()
      File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pgeocode.py", line 333, in __init__
        super().__init__(country)
      File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pgeocode.py", line 193, in __init__
        country = country.upper()
      File "C:\Users\kyddgorg\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py", line 5575, in __getattr__
        return object.__getattribute__(self, name)
    AttributeError: 'Series' object has no attribute 'upper'
Thanks for any help
Answer
I took the liberty of modifying the column labels to unify their names; please make sure they match the headers in your data file.
    import pgeocode
    import pandas as pd

    df = pd.read_excel(r"C:\Users\APP DEVPython\distance\Road\Address.xlsx", sheet_name=0)

    ## This is a dictionary that simulates the data
    # data = {
    #     "Country": ["US", "GB"],
    #     "Starting post code": ["99685", "AB15"],
    #     "Destination post code": ["65039", "DD9"],
    # }
    # df = pd.DataFrame.from_dict(data=data)

    # For each row, build a GeoDistance for that row's country and query its two post codes.
    # str() guards against Excel reading numeric post codes as integers.
    df["Distance"] = df.apply(
        lambda row: pgeocode.GeoDistance(row["Country"]).query_postal_code(
            str(row["Starting post code"]), str(row["Destination post code"])
        ),
        axis=1,
    )

    print(df)
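As for why the original line failed: the traceback shows pgeocode calling country.upper() on whatever is passed to GeoDistance, so it expects a single country code string rather than a whole column. Here is a small sketch of that difference; the `countries` Series and its lowercase values are made up for illustration and are not taken from your file.

```python
import pandas as pd

# Illustrative data only: a column of country codes like the one in the table.
countries = pd.Series(["us", "gb"], name="Country")

# A Series is a container, so it has no string methods of its own.
# This is what triggers the AttributeError in the traceback.
has_upper = hasattr(countries, "upper")

# Each element is a plain str, which is what a per-row call receives.
per_row = [code.upper() for code in countries]

# The column-wide equivalent goes through the .str accessor.
vectorised = countries.str.upper().to_list()

print(has_upper, per_row, vectorised)
```

With apply(..., axis=1), row["Country"] is one of those plain strings, which is why GeoDistance accepts it.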
I must warn you that using apply is quite inefficient, since it runs row by row and here creates a new GeoDistance object for every row, so it may scale badly if you have millions of rows.
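If that becomes a problem, one possible alternative is to create one GeoDistance per country instead of one per row, and pass each country's post codes as lists, which is how the question's own code calls query_postal_code. The following is only a sketch under those assumptions: `add_distances`, `pgeocode_distance` and the `make_calculator` parameter are hypothetical names introduced here, and the column labels are the unified ones from this answer. Also note that, as far as I know, pgeocode reports straight-line (great-circle) distance in kilometres, not road miles.

```python
import pandas as pd


def pgeocode_distance(country):
    """Build a pgeocode distance calculator for one country code."""
    import pgeocode  # imported lazily so add_distances can be tried without it

    return pgeocode.GeoDistance(country)


def add_distances(df, make_calculator=pgeocode_distance):
    """Return a copy of df with a "Distance" column, one calculator per country."""
    out = df.copy()
    out["Distance"] = float("nan")  # rows with a missing country stay NaN
    for country, group in out.groupby("Country"):
        calculator = make_calculator(country)  # created once per country
        out.loc[group.index, "Distance"] = calculator.query_postal_code(
            group["Starting post code"].astype(str).to_list(),
            group["Destination post code"].astype(str).to_list(),
        )
    return out
```

You would call it as `df = add_distances(df)` after reading the Excel file. The calculator is passed in as a parameter only so the grouping logic can be checked with a stand-in object before running it against real data.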