Skip to content
Advertisement

pandas dataframe moving certain headers to index

I have the following dataframe:

d1 = [{'AB':['Record=1','POS=2'], 'AVF':[], 'HI':['Record=343', 'POS=4'], 'version_1':[]},
  {'AB':[], 'AVF':['Record=4', 'POS=454'], 'AVF':[], 'HI':[], 'version_2':[]},
  {'AB':['Record=4', 'POS=32'], 'AVF':[], 'HI':[], 'version_3':[]}]

frame = pd.DataFrame(d1)




   AB               AVF  HI                 version_1   version_2   version_3
0 [Record=1, POS=2] []   [Record=343, POS=4]    []           NaN        NaN
1 []                []   []                     NaN          []         NaN
2 [Record=4, POS=32][]   []                     NaN          NaN        []

Desired output:

           AB               AVF  HI                 
version_1 [Record=1, POS=2] []   [Record=343, POS=4]    
version_2 []                []   []                     
version_3 [Record=4, POS=32][]   []       

I have tried:

frame.set_index(['version1','version2','version3']).stack()

The real dictionary is very large with over 30 versions so simply typing out the version numbers into a list is not an option.

thanks

Advertisement

Answer

Try this:

frame.melt(['AB', 'AVF', 'HI']).dropna().drop('value', axis=1).set_index('variable')

Output:

                           AB AVF                   HI
variable                                              
version_1   [Record=1, POS=2]  []  [Record=343, POS=4]
version_2                  []  []                   []
version_3  [Record=4, POS=32]  []                   []
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement