I have the following dataframe:
d1 = [{'AB':['Record=1','POS=2'], 'AVF':[], 'HI':['Record=343', 'POS=4'], 'version_1':[]}, {'AB':[], 'AVF':['Record=4', 'POS=454'], 'AVF':[], 'HI':[], 'version_2':[]}, {'AB':['Record=4', 'POS=32'], 'AVF':[], 'HI':[], 'version_3':[]}] frame = pd.DataFrame(d1) AB AVF HI version_1 version_2 version_3 0 [Record=1, POS=2] [] [Record=343, POS=4] [] NaN NaN 1 [] [] [] NaN [] NaN 2 [Record=4, POS=32][] [] NaN NaN []
Desired output:
AB AVF HI version_1 [Record=1, POS=2] [] [Record=343, POS=4] version_2 [] [] [] version_3 [Record=4, POS=32][] []
I have tried:
frame.set_index(['version1','version2','version3']).stack()
The real dictionary is very large with over 30 versions so simply typing out the version numbers into a list is not an option.
thanks
Advertisement
Answer
Try this:
frame.melt(['AB', 'AVF', 'HI']).dropna().drop('value', axis=1).set_index('variable')
Output:
AB AVF HI variable version_1 [Record=1, POS=2] [] [Record=343, POS=4] version_2 [] [] [] version_3 [Record=4, POS=32] [] []