Here is a dataframe data_1
.
JavaScript
x
9
1
data_1=pd.DataFrame({'id':['1','1','1','1','1','2','2','2','2','2'],
2
'date':['20220325','20220325','20220325','20220327','20220327','20220705','20220705','20220706','20220706','20220706'],
3
'base':["wt","bmi","p","wt","wt","bmi","bmi","wt","p","bmi"],
4
'value':['75','21','25','76','77','19','18','85','23','19']},
5
)
6
data_1['id'] = pd.to_numeric(data_1['id'], errors='coerce')
7
data_1['date'] = pd.to_numeric(data_1['date'], errors='coerce')
8
data_1['value'] = pd.to_numeric(data_1['value'], errors='coerce')
9
I want to make this data_1
as follows:
JavaScript
1
6
1
data_1=pd.DataFrame({'id':[1,1,1,2,2,2],
2
'date':[20220325,20220327,20220327,20220705,20220705,20220706],
3
'wt':[75,76,77,"","",85],
4
'bmi':[21,"","",19,18,19],
5
'p':[25,"","","","",23]})
6
I tried pivot_table
,but the output is not the same as I expected.
Moreover, I need to save the data_1
as csv file, but the there are no columns id
and date
in the csv file that I made.
Is there any method to change the data_1
as my expected output?
Advertisement
Answer
Because you want the duplicates still in your pivot table, this was an interesting question. My solution is to make a pivot table for the non-duplicates and the duplicates and then concatenate them together.
JavaScript
1
10
10
1
tf = data_1[['id', 'date', 'base']].duplicated()
2
3
df1 = data_1[~tf]
4
df2 = data_1[tf]
5
6
df1 = pd.pivot_table(df1, values='value', index=['id', 'date'], columns='base').reset_index()
7
df2 = pd.pivot_table(df2, values='value', index=['id', 'date'], columns='base').reset_index()
8
9
data_1 = pd.concat([df1, df2]).fillna('')
10
Output:
JavaScript
1
9
1
>>> data_1
2
base id date bmi p wt
3
0 1 20220325 21.0 25.0 75.0
4
1 1 20220327 76.0
5
2 2 20220705 19.0
6
3 2 20220706 19.0 23.0 85.0
7
0 1 20220327 77.0
8
1 2 20220705 18.0
9
Then to a csv file use data_1.to_csv(file_path, index=False)