I have a dataset, `df`, that looks like this:
Date | Code | City | State | Quantity x | Quantity y | Population | Cases | Deaths |
---|---|---|---|---|---|---|---|---|
2019-01 | 10001 | Los Angeles | CA | 445 | 0 | 0 | | |
2019-01 | 10002 | Sacramento | CA | 4450 | 556 | 0 | 0 | |
2020-03 | 12223 | Houston | TX | 440 | 4440 | 35000000 | 23 | 11 |
… | … | … | … | … | … | … | … | … |
2021-07 | 10002 | Sacramento | CA | 3220 | NA | 5444000 | 211 | 22 |
The start and end dates are the same for all cities. I have over 4,000 different cities and would like to plot a dual y-axis graph for each city, using something similar to the following code:
```python
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(figsize=(9,9))
color = 'tab:red'
ax1.set_xlabel('Date')
ax1.set_ylabel('Quantity X', color=color)
ax1.plot(df['Quantity x'], color=color)
ax1.tick_params(axis='y', labelcolor=color)

ax2 = ax1.twinx()
color2 = 'tab:blue'
ax2.set_ylabel('Deaths', color=color2)
ax2.plot(df['Deaths'], color=color2)
ax2.tick_params(axis='y', labelcolor=color2)

plt.show()
```
I would like to create a loop that runs the code above for every `Code` (each `Code` corresponds to one `City`), plotting quantity x and deaths, and saves each graph to a folder. How can I write a loop that does this and starts a new plot for each distinct `Code`?
Note: some values in `df['Quantity x']` and `df['Population']` are blank.
Answer
If I understood you correctly, you are looking for a filtering functionality:
```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_quantity_and_death(df):
    # your code
    fig, ax1 = plt.subplots(figsize=(9, 9))
    color = 'tab:red'
    ax1.set_xlabel('Date')
    ax1.set_ylabel('Quantity X', color=color)
    ax1.plot(df['Quantity x'], color=color)
    ax1.tick_params(axis='y', labelcolor=color)

    ax2 = ax1.twinx()
    color2 = 'tab:blue'
    ax2.set_ylabel('Deaths', color=color2)
    ax2.plot(df['Deaths'], color=color2)
    ax2.tick_params(axis='y', labelcolor=color2)

    # save & close addon
    fig.savefig(f"Code_{df['Code'].iloc[0]}.png")
    plt.close(fig)


df = pd.DataFrame()  # this needs to be replaced by your dataset

# get unique city codes, loop over them, filter data and plot it
unique_codes = pd.unique(df['Code'])
for code in unique_codes:
    filtered_df = df[df['Code'] == code]
    plot_quantity_and_death(filtered_df)
```
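The question also asks for the images to go into a folder and mentions blank values, so here is one possible variant as a minimal sketch. It assumes the column names from the sample table and that `Date` is formatted like `2019-01`. The function name `save_city_plots`, the default folder name `plots`, and the per-plot title are hypothetical choices, not part of the original code. It uses `groupby` instead of filtering by hand, plots against `Date` rather than the row index, and converts blanks or `NA` strings to `NaN`, which matplotlib leaves as gaps in the line.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, renders to files only
import matplotlib.pyplot as plt
import pandas as pd


def save_city_plots(df, out_dir="plots"):
    """Save one dual-axis PNG per Code into out_dir and return the file paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    # Work on a copy so the caller's DataFrame is left unchanged.
    data = df.copy()
    # Blank cells and "NA" strings become NaN.
    for col in ("Quantity x", "Deaths"):
        data[col] = pd.to_numeric(data[col], errors="coerce")
    # Assumption: Date is a year-month string such as "2019-01".
    data["Date"] = pd.to_datetime(data["Date"], format="%Y-%m")

    saved = []
    # groupby yields one (code, sub-DataFrame) pair per distinct Code.
    for code, group in data.sort_values("Date").groupby("Code"):
        fig, ax1 = plt.subplots(figsize=(9, 9))
        ax1.set_xlabel("Date")
        ax1.set_ylabel("Quantity X", color="tab:red")
        ax1.plot(group["Date"], group["Quantity x"], color="tab:red")
        ax1.tick_params(axis="y", labelcolor="tab:red")

        ax2 = ax1.twinx()
        ax2.set_ylabel("Deaths", color="tab:blue")
        ax2.plot(group["Date"], group["Deaths"], color="tab:blue")
        ax2.tick_params(axis="y", labelcolor="tab:blue")

        ax1.set_title(f"{group['City'].iloc[0]} ({code})")
        target = out_path / f"Code_{code}.png"
        fig.savefig(target)
        plt.close(fig)  # release the figure; matters with thousands of plots
        saved.append(target)
    return saved
```

Calling `save_city_plots(df, "plots")` would write files such as `plots/Code_10001.png`. Closing each figure inside the loop keeps memory use flat, which is worth doing with more than 4,000 cities.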