I have a dataframe like the one below and need to (1) create a new dataframe for each unique date and (2) set a global variable to the date of that new dataframe. This needs to happen in a loop.
Using the dataframe below, I need to iterate through 3 new dataframes, one for each date value (202107, 202108, and 202109). This loop occurs within an existing function that then uses the new dataframe and its respective global variable of each iteration in further calculations. For example, the first iteration would yield a new dataframe consisting of the first two rows of the below dataframe and a value for the new global variable of “202107.” What is the most straightforward way of doing this?
Date | Col1 | Col2 |
---|---|---|
202107 | 1.23 | 6.72 |
202107 | 1.56 | 2.54 |
202108 | 1.78 | 7.54 |
202108 | 1.53 | 7.43 |
202108 | 1.58 | 2.54 |
202109 | 1.09 | 2.43 |
202109 | 1.07 | 5.32 |
Answer
Loop over the results of `.groupby`:

    for _, new_df in df.groupby("Date"):
        print(new_df)
        print("-" * 80)
Prints:

         Date  Col1  Col2
    0  202107  1.23  6.72
    1  202107  1.56  2.54
    --------------------------------------------------------------------------------
         Date  Col1  Col2
    2  202108  1.78  7.54
    3  202108  1.53  7.43
    4  202108  1.58  2.54
    --------------------------------------------------------------------------------
         Date  Col1  Col2
    5  202109  1.09  2.43
    6  202109  1.07  5.32
    --------------------------------------------------------------------------------
Then you can store `new_df` in a list or a dictionary and use it afterwards.
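As a minimal sketch of the dictionary option, the loop can keep the group key instead of discarding it, so each date travels with its dataframe and a separate global variable may not be needed. The sample data is copied from the question; the name `frames_by_date` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Date": [202107, 202107, 202108, 202108, 202108, 202109, 202109],
        "Col1": [1.23, 1.56, 1.78, 1.53, 1.58, 1.09, 1.07],
        "Col2": [6.72, 2.54, 7.54, 7.43, 2.54, 2.43, 5.32],
    }
)

# groupby yields (key, sub-dataframe) pairs; keep the key this time.
# `frames_by_date` is a hypothetical name, not something from the question.
frames_by_date = {date: new_df for date, new_df in df.groupby("Date")}

for date, new_df in frames_by_date.items():
    # `date` stands in for the per-iteration "global" value.
    print(date, len(new_df))
```

Inside the existing function, `date` and `new_df` could then be passed straight to the further calculations, which avoids mutating a global on every iteration.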