Skip to content
Advertisement

Merging time series data so that column values are fitted into dictionaries

I have two time-series data frames that track the same certain countries throughout the same amount of time, but the variables they track for each observation represent vastly different things. For example, the first data frame is like so:

Tracking variable ‘A’:

Country 01/01/2020 01/02/2020 01/03/2020 04/25/2021
AFG 0 0 1 5000
CHN 0 20 50 0
USA 1 2 4 10000

The second data frame tracks a different variable ‘B’ and looks like the following:

Country 01/01/2020 01/02/2020 01/03/2020 04/25/2021
AFG 10 10 0 0
CHN 1 1 1 20
USA 0 7 10 543

I want to merge these two data frames so that the following is produced ({} to represent internal dictionaries):

Country 01/01/2020 01/02/2020 01/03/2020 04/25/2021
AFG (A: 0, B: 10) (A: 0, B: 10) (A: 1, B: 0) (A: 5000, B: 0)
CHN (A: 0, B: 1) (A: 20, B: 1) (A: 50, B: 1) (A: 0, B: 20)
USA (A: 1, B: 0) (A: 2, B: 7) (A: 4, B: 10) (A: 10000, B: 543)

Is there a command I am missing in Pandas or is there some type of way to do this I am missing? Thank you for whatever help you can give me.

Advertisement

Answer

Code

pd.concat([df1, df2]).groupby('Country').agg(list).applymap(
lambda x: dict(zip(['A','B'], [*x]))).reset_index()

Output

    Country 01/01/2020          01/02/2020  01/03/2020  04/25/2021
0   AFG    {'A': 0, 'B': 10}    {'A': 0, 'B': 10}   {'A': 1, 'B': 0}    {'A': 5000, 'B': 0}
1   CHN    {'A': 0, 'B': 1}     {'A': 20, 'B': 1}   {'A': 50, 'B': 1}   {'A': 0, 'B': 20}
2   USA    {'A': 1, 'B': 0}     {'A': 2, 'B': 7}    {'A': 4, 'B': 10}   {'A': 10000, 'B': 543}

Explanation

We are first concatenating both dfs using pd.concat.

Then we are grouping the resultant df by country using groupby.

On the grouped df is aggregated using agg to create a list of items based on Country.

Then we are mapping the list items to create a dict with A and B as keys using applymap

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement