I have a dict of city names, each with an empty list as its value. I am trying to use df.iterrows() to append the corresponding names to each dict key (city):
for index, row in df.iterrows():
    dict[row['city']].append(row['fullname'])
Can somebody explain why the code above appends every 'fullname' value to every key in the dict, instead of appending each name only to its own city key?
I.e. instead of getting the result
{"City1":["Name1","Name2"],"City2":["Name3","Name4"]}
I’m getting
{"City1":["Name1","Name2","Name3","Name4"],"City2":["Name1","Name2","Name3","Name4"]}
Edit: providing a sample of the dataframe:
import pandas as pd

d = {'fullname': ['Jason', 'Katty', 'Molly', 'Nicky'], 'city': ['Arizona', 'Arizona', 'California', 'California']}
df = pd.DataFrame(data=d)
Edit 2: I’m pretty sure that my problem lies in my dict, since I created it in the following way:
cities = []
for i in df['city']:
    cities.append(i)
dict = dict.fromkeys(set(cities), [])
When I print dict, I get the correct output:
{"Arizona":[],"California":[]}
However, if I look up a single key, e.g. dict['Arizona'], I get this:
{"index":[],"columns":[],"data":[]}
Answer
The problem is indeed dict.fromkeys: the default value is evaluated once, so all of the keys are “pointing to” the same list object.
>>> dict.fromkeys(['one', 'two'], [])
{'one': [], 'two': []}
>>> d = dict.fromkeys(['one', 'two'], [])
>>> d['one'].append('three')
>>> d
{'one': ['three'], 'two': ['three']}
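One way to confirm the sharing is to compare object identity rather than contents. The following is only a small sketch; the variable names `shared` and `same_object` are made up for illustration.

```python
# dict.fromkeys stores the very same list object under every key
shared = dict.fromkeys(['one', 'two'], [])

# `is` compares identity (same object), not equality (same contents)
same_object = shared['one'] is shared['two']
print(same_object)  # True

# Appending through one key is therefore visible through the other
shared['one'].append('three')
print(shared['two'])  # ['three']
```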
You’d need a comprehension to create a distinct list for each key.
>>> d = { k: [] for k in ['one', 'two'] }
>>> d
{'one': [], 'two': []}
>>> d['one'].append('three')
>>> d
{'one': ['three'], 'two': []}
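Applied to the sample dataframe from the question, the comprehension could look roughly like this. It is a sketch, not the only way to do it: the name `names_by_city` is an assumption introduced here to avoid shadowing the built-in `dict`, and `df['city'].unique()` stands in for the `set(cities)` step.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'fullname': ['Jason', 'Katty', 'Molly', 'Nicky'],
    'city': ['Arizona', 'Arizona', 'California', 'California'],
})

# The comprehension evaluates [] once per key, so each city gets its own list
names_by_city = {city: [] for city in df['city'].unique()}

# Same loop as in the question, now appending to independent lists
for index, row in df.iterrows():
    names_by_city[row['city']].append(row['fullname'])

print(names_by_city)
# {'Arizona': ['Jason', 'Katty'], 'California': ['Molly', 'Nicky']}
```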
You are also manually implementing a groupby with your code:
>>> df.groupby('city')['fullname'].agg(list)
city
Arizona       [Jason, Katty]
California    [Molly, Nicky]
Name: fullname, dtype: object
If you want a dict:
>>> df.groupby('city')['fullname'].agg(list).to_dict()
{'Arizona': ['Jason', 'Katty'], 'California': ['Molly', 'Nicky']}