I have date and time data now I want to reduce this dataframe to two columns with Timestamp (date+time) in a column and value in another column
current df –
JavaScript
x
6
1
Date 8 am 10 am 1 pm
2
-----------------------------------------------
3
23/02/2022 5 10 11
4
24/02/2022 2 17 15
5
25/02/2022 7 90 175
6
desired df –
JavaScript
1
12
12
1
Timestamp value
2
----------------------------------------------
3
2022-02-23 00:08:00 5
4
2022-02-23 00:10:00 10
5
2022-02-23 00:13:00 11
6
2022-02-24 00:08:00 2
7
2022-02-24 00:10:00 17
8
2022-02-24 00:13:00 15
9
2022-02-25 00:08:00 7
10
2022-02-25 00:10:00 90
11
2022-02-25 00:13:00 175
12
Here is original list from which I’m creating my dataframe –
JavaScript
1
11
11
1
[['Date', '08:00', '10:00', '12:00', '14:00', '19:00', '22:00', '03:00'],
2
['23/02/2022', '140', '244', '191', '88', '263', '252', '159'],
3
['24/02/2022', '184', '235', '189', '108', '283', '300', '202'],
4
['25/02/2022', '131', '217', '135', '179', '207', '284', '177'],
5
['26/02/2022', '112', '188', '96', '139', '148', '188', '125'],
6
['27/02/2022', '130', '189', '104', '163', '210', '221', '139'],
7
['28/02/2022', '118', '89', '84', '113', '259', '234', '105'],
8
['01/03/2022', '98', '89', '77', '82', '138', '174', '71'],
9
['02/03/2022', '87', '187', '69', '118', '199', '178', '59'],
10
['03/03/2022', '90', '200', '110', '102', '180', '216', '72']]
11
Advertisement
Answer
Use melt
to flatten your dataframe and set Time
as a name of the variable column. Combine columns Date
and Time
to create the timestamp then sort_values
to reorder your dataframe. Finally, keep only Timestamp
and value
columns:
JavaScript
1
25
25
1
combine_datetime = lambda x: pd.to_datetime(x['Date'] + ' ' + x['Time'],
2
format='%d/%m/%Y %H:%M')
3
4
out = (
5
df.melt('Date', var_name='Time').assign(Timestamp=combine_datetime)
6
.sort_values('Timestamp', ignore_index=True)[['Timestamp', 'value']]
7
)
8
print(out)
9
10
# Output
11
Timestamp value
12
0 2022-02-23 03:00:00 159
13
1 2022-02-23 08:00:00 140
14
2 2022-02-23 10:00:00 244
15
3 2022-02-23 12:00:00 191
16
4 2022-02-23 14:00:00 88
17
..
18
58 2022-03-03 10:00:00 200
19
59 2022-03-03 12:00:00 110
20
60 2022-03-03 14:00:00 102
21
61 2022-03-03 19:00:00 180
22
62 2022-03-03 22:00:00 216
23
24
[63 rows x 2 columns]
25
Note: For pd.to_datetime
, I used an explicit format to avoid Pandas infer the datetime if there are any ambiguities with the day first.