Skip to content
Advertisement

Looping through multiple columns in a table in Python

I’m trying to loop through a table that contains covid-19 data. My table has 4 columns: month, day, location, and cases. The values of each column in the table is stored in its own list, so each list has the same length. (Ie. there is a month list, day list, location list, and cases list). There are 12 months, with up to 31 days in a month. Cases are recorded for many locations around the world. I would like to figure out what day of the year had the most total combined global cases. I’m not sure how to structure my loops appropriately. An oversimplified sample version of the table represented by the lists is shown below.

In this small example, the result would be month 1, day 3 with 709 cases (257 + 452).

Month Day Location Cases
1 1 CAN 124
1 1 USA 563
1 2 CAN 242
1 2 USA 156
1 3 CAN 257
1 3 USA 452
. .
12 31

Advertisement

Answer

You group your dataframe by month and day. Then iterate through the groups to find the group in which the sum of cases in all locations was max as shown below:

import pandas as pd
df = pd.DataFrame({'Month':[1,1,1,1,1,1], 'Day':[1,1,2,2,3,3],
                   'Location':['CAN', 'USA', 'CAN', 'USA','CAN', 'USA'],
                   'Cases':[124,563,242,156,257,452]})

grouped = df.groupby(['Month', 'Day'])
max_sum = 0
max_day = None
for idx, group in grouped:
    if group['Cases'].sum() > max_sum:
        max_sum = group['Cases'].sum()
        max_day = group

month = max_day['Month'].iloc[1]
day = max_day['Day'].iloc[1]
print(f'Maximum cases of {max_sum} occurred on {month}/{day}.')

#prints: Maximum cases of 709 occurred on 1/3

If you don’t want to use Pandas, this is how you do it:

months = [1,1,1,1,1,1]
days = [1,1,2,2,3,3]
locations = ['CAN', 'USA', 'CAN', 'USA','CAN', 'USA']
cases = [124,563,242,156,257,452]
dic = {}
target_day = 0
count = 0

for i in range(len(days)):
    if days[i] != target_day:
        target_day = days[i]
        count = cases[i]
    else:
        count += cases[i]
        dic[f'{months[i]}/{days[i]}'] = count

max_cases = max(dic.values())
worst_day = list(dic.keys())[list(dic.values()).index(max_cases)]

print(f'Maximum cases of {max_cases} occurred on {worst_day}.')

#Prints: Maximum cases of 709 occurred on 1/3.
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement