I’m trying to fill an empty DataFrame (OutputData) in Python with values from another DataFrame (InputData).
InputData has four columns (“Strike”, “DTE”, “IV”, “Pred_IV”). OutputData has all unique Strikes from InputData as its index and all unique DTEs from InputData as its column names.
My goal is to fill OutputData with the corresponding “Pred_IV” values from InputData. Because each value has to match both the index and the column name, I can’t work out how to do it with any function I know.
If no value in InputData matches both the index and the column name, the cell can remain NaN.
Find below the dataframes I use with the df.to_dict() extract for additional detail.
Many thanks for your help.
Best, Flo
InputData.head()
    Strike    DTE      IV   Pred_IV
8   0.5131  2.784  0.3366  0.733360
9   0.5131  3.781  0.3291  0.735295
20  0.5864  2.784  0.3178  0.733476
21  0.5864  3.781  0.3129  0.735357
22  0.5864  4.778  0.3008  0.736143
InputData.head().to_dict()
{'Strike': {8: 0.5131, 9: 0.5131, 20: 0.5864, 21: 0.5864, 22: 0.5864}, 'DTE': {8: 2.784, 9: 3.781, 20: 2.784, 21: 3.781, 22: 4.778}, 'IV': {8: 0.33659999999999995, 9: 0.32909999999999995, 20: 0.3178, 21: 0.3129, 22: 0.30079999999999996}, 'Pred_IV': {8: 0.7333602770095773, 9: 0.7352946387206533, 20: 0.7334762408944806, 21: 0.7353567361456718, 22: 0.7361431377881676}}
OutputData.head()
        0.025  0.101  0.197  0.274  0.523  0.772  1.769  2.267  2.784  3.781  4.778  5.774
0.5131    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.5864    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.6597    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.7330    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.7697    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
OutputData.head(2).to_dict()
{0.025: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 0.101: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 0.197: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 0.274: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 0.523: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 0.772: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 1.769: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 2.267: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 2.784: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 3.781: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 4.778: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}, 5.774: {0.5131: nan, 0.5864: nan, 0.6597: nan, 0.733: nan, 0.7696999999999999: nan}}
Answer
Here is a way to do what I believe your question is asking:
import pandas as pd
import numpy as np

InputData = pd.DataFrame(
    columns='Strike,DTE,IV,Pred_IV'.split(','),
    index=[8, 9, 20, 21, 22],
    data=[[0.5131, 2.784, 0.3366, 0.733360],
          [0.5131, 3.781, 0.3291, 0.735295],
          [0.5864, 2.784, 0.3178, 0.733476],
          [0.5864, 3.781, 0.3129, 0.735357],
          [0.5864, 4.778, 0.3008, 0.736143]])

# Empty grid: unique strikes as the index, unique DTEs as the columns.
# sorted() keeps the label order deterministic (a bare set has no guaranteed order).
OutputData = pd.DataFrame(
    data=np.nan,
    columns=pd.Index(name='DTE', data=sorted(set(InputData.DTE.to_list()))),
    index=pd.Index(name='Strike', data=sorted(set(InputData.Strike.to_list()))))

# Write each row's Pred_IV into the cell addressed by its (Strike, DTE) pair.
def foo(x):
    OutputData.loc[x.Strike, x.DTE] = x.Pred_IV

InputData.apply(foo, axis=1)
print(OutputData)
Output:
DTE        2.784     3.781     4.778
Strike
0.5131  0.733360  0.735295       NaN
0.5864  0.733476  0.735357  0.736143
If you prefer unnamed indexes, you can do this instead:
OutputData = pd.DataFrame(
    data=np.nan,
    columns=sorted(set(InputData.DTE.to_list())),
    index=sorted(set(InputData.Strike.to_list())))
Output:
           2.784     3.781     4.778
0.5131  0.733360  0.735295       NaN
0.5864  0.733476  0.735357  0.736143
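As an alternative to filling the grid cell by cell, the same Strike-by-DTE table can probably be built in one step with DataFrame.pivot. This is only a sketch using the five sample rows from the question. It assumes each (Strike, DTE) pair occurs at most once in InputData; with duplicates, pivot raises a ValueError and pivot_table with an aggregation would be needed instead. The names all_strikes and all_dtes are hypothetical stand-ins for a pre-defined grid such as the index and columns of an existing OutputData.

```python
import pandas as pd

# Sample rows copied from InputData.head() in the question.
InputData = pd.DataFrame(
    {
        "Strike": [0.5131, 0.5131, 0.5864, 0.5864, 0.5864],
        "DTE": [2.784, 3.781, 2.784, 3.781, 4.778],
        "IV": [0.3366, 0.3291, 0.3178, 0.3129, 0.3008],
        "Pred_IV": [0.733360, 0.735295, 0.733476, 0.735357, 0.736143],
    },
    index=[8, 9, 20, 21, 22],
)

# Reshape long -> wide: one row per Strike, one column per DTE.
# Combinations that never occur in InputData are left as NaN.
OutputData = InputData.pivot(index="Strike", columns="DTE", values="Pred_IV")

# Optional: align the result to a pre-defined grid of labels.
# all_strikes / all_dtes are hypothetical; labels missing from the pivot become NaN.
all_strikes = [0.5131, 0.5864, 0.6597]
all_dtes = [2.267, 2.784, 3.781, 4.778]
Aligned = OutputData.reindex(index=all_strikes, columns=all_dtes)

print(OutputData)
print(Aligned)
```

Because the labels are floats, reindex only matches values that are exactly equal, so the grid labels should come from the same source as the values in InputData.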