
Python dictionary: group by, order by, and create a new key based on rank

How can I group a list of Python dictionaries by location, order each group by time, and create a new key based on the rank?

li_input_dict

[
{"name":"Alex","location":"US","time":"2020-05-20 10:36:20"},
{"name":"Bob","location":"India","time":"2017-05-20 12:36:20"},
{"name":"Jon","location":"US","time":"2017-05-20 05:36:20"},
{"name":"Kerry","location":"India","time":"2014-05-20 05:36:20"},
{"name":"Mat","location":"US","time":"2013-01-20 05:36:20"},
{"name":"Sazen","location":"India","time":"2013-01-20 05:36:20"}
]

The required output adds a new key, new_name, to each dictionary. Its value is the name with a number appended, where the number is the rank of the entry within its location group when ordered by timestamp:

[
    {"name":"Alex","location":"US","time":"2020-05-20 10:36:20","new_name":"Alex_3"},
    {"name":"Bob","location":"India","time":"2017-05-20 12:36:20","new_name":"Bob_3"},
    {"name":"Jon","location":"US","time":"2017-05-20 05:36:20","new_name":"Jon_2"},
    {"name":"Kerry","location":"India","time":"2014-05-20 05:36:20","new_name":"Kerry_2"},
    {"name":"Mat","location":"US","time":"2013-01-20 05:36:20","new_name":"Mat_1"},
    {"name":"Sazen","location":"India","time":"2013-01-20 05:36:20","new_name":"Sazen_1"}
    ]

Edited after Mark's comment

I tried sorting by time and then by location with the lines below, but I could not work out how to apply a rank to the dictionaries. Sorry, I am not very experienced in Python; I could do this in SQL.

import time
from operator import itemgetter
li_input_dict.sort(key=lambda x: time.mktime(time.strptime(x['time'], '%Y-%m-%d %H:%M:%S')))
li_input_dict = sorted(li_input_dict, key=itemgetter('location'))

I am not looking for a pandas solution.

Answer

First make a list of dictionaries sorted by location and time. This will put the groups together and within the groups they will be sorted by time:

l = [
    {"name":"Alex","location":"US","time":"2020-05-20 10:36:20"},
    {"name":"Bob","location":"India","time":"2017-05-20 12:36:20"},
    {"name":"Jon","location":"US","time":"2017-05-20 05:36:20"},
    {"name":"Kerry","location":"India","time":"2014-05-20 05:36:20"},
    {"name":"Mat","location":"US","time":"2013-01-20 05:36:20"},
    {"name":"Sazen","location":"India","time":"2013-01-20 05:36:20"}
]

l_sort = sorted(l, key=lambda d: (d['location'], d['time']))

Now you have a list l_sort that looks like:

[{'name': 'Sazen', 'location': 'India', 'time': '2013-01-20 05:36:20'},
 {'name': 'Kerry', 'location': 'India', 'time': '2014-05-20 05:36:20'},
 {'name': 'Bob', 'location': 'India', 'time': '2017-05-20 12:36:20'},
 {'name': 'Mat', 'location': 'US', 'time': '2013-01-20 05:36:20'},
 {'name': 'Jon', 'location': 'US', 'time': '2017-05-20 05:36:20'},
 {'name': 'Alex', 'location': 'US', 'time': '2020-05-20 10:36:20'}]
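Sorting on the raw time string works here because the timestamps are zero-padded and written year first, so alphabetical order matches chronological order. If the format cannot be relied on in that way, one option is to parse the string inside the sort key. The sketch below assumes the format shown in the question; the names TIME_FORMAT, records and sort_key are illustrative and not part of the original answer.

```python
from datetime import datetime

# Assumed timestamp format, matching the sample data in the question.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

records = [
    {"name": "Alex", "location": "US", "time": "2020-05-20 10:36:20"},
    {"name": "Jon", "location": "US", "time": "2017-05-20 05:36:20"},
    {"name": "Bob", "location": "India", "time": "2017-05-20 12:36:20"},
]

def sort_key(d):
    # Location first so groups stay together, then the parsed
    # timestamp so each group is in chronological order.
    return (d["location"], datetime.strptime(d["time"], TIME_FORMAT))

records_sorted = sorted(records, key=sort_key)
names_sorted = [d["name"] for d in records_sorted]
```

The rest of the approach is unchanged; only the key function differs.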

Now that everything is in the correct place you can use itertools.groupby from the standard library to make groups based on location, then for each dict in each group update the dictionary:

from itertools import groupby 

# group by location
groups = groupby(l_sort, key=lambda d: d['location'])

# for each location
for k, group in groups:
    # update the dicts with the correct index starting at 1
    for i, d in enumerate(group, 1):
        d['new_name'] = f"{d['name']}_{i}"

This updates the dicts in place. Because l_sort holds the same dict objects as l, the original list (still in its original order) now contains:

[{'name': 'Alex','location': 'US','time': '2020-05-20 10:36:20','new_name': 'Alex_3'},
 {'name': 'Bob','location': 'India','time': '2017-05-20 12:36:20','new_name': 'Bob_3'},
 {'name': 'Jon','location': 'US','time': '2017-05-20 05:36:20','new_name': 'Jon_2'},
 {'name': 'Kerry','location': 'India','time': '2014-05-20 05:36:20','new_name': 'Kerry_2'},
 {'name': 'Mat','location': 'US','time': '2013-01-20 05:36:20','new_name': 'Mat_1'},
 {'name': 'Sazen','location': 'India','time': '2013-01-20 05:36:20','new_name': 'Sazen_1'}]
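If mutating the input is not wanted, the same two steps can be wrapped in a function that works on copies. This is only a sketch under that assumption: add_ranked_names and sample are hypothetical names, and the time strings are assumed to sort chronologically as plain text, as in the question.

```python
from itertools import groupby

def add_ranked_names(records):
    """Return copies of the records, in their original order, with 'new_name' added."""
    # Shallow copies, so the caller's dicts are left untouched.
    copies = [dict(d) for d in records]
    # groupby only groups consecutive items, so sort by location first,
    # and by time within each location.
    ordered = sorted(copies, key=lambda d: (d["location"], d["time"]))
    for _, group in groupby(ordered, key=lambda d: d["location"]):
        # Rank starts at 1 for the earliest time in each location.
        for rank, d in enumerate(group, 1):
            d["new_name"] = f"{d['name']}_{rank}"
    return copies

sample = [
    {"name": "Alex", "location": "US", "time": "2020-05-20 10:36:20"},
    {"name": "Bob", "location": "India", "time": "2017-05-20 12:36:20"},
    {"name": "Jon", "location": "US", "time": "2017-05-20 05:36:20"},
    {"name": "Sazen", "location": "India", "time": "2013-01-20 05:36:20"},
]

result = add_ranked_names(sample)
```

Here result keeps the input order, while sample itself gains no new_name key.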

User contributions licensed under: CC BY-SA