I have date and time data now I want to reduce this dataframe to two columns with Timestamp (date+time) in a column and value in another column
current df –
Date 8 am 10 am 1 pm ----------------------------------------------- 23/02/2022 5 10 11 24/02/2022 2 17 15 25/02/2022 7 90 175
desired df –
Timestamp value ---------------------------------------------- 2022-02-23 00:08:00 5 2022-02-23 00:10:00 10 2022-02-23 00:13:00 11 2022-02-24 00:08:00 2 2022-02-24 00:10:00 17 2022-02-24 00:13:00 15 2022-02-25 00:08:00 7 2022-02-25 00:10:00 90 2022-02-25 00:13:00 175
Here is original list from which I’m creating my dataframe –
[['Date', '08:00', '10:00', '12:00', '14:00', '19:00', '22:00', '03:00'], ['23/02/2022', '140', '244', '191', '88', '263', '252', '159'], ['24/02/2022', '184', '235', '189', '108', '283', '300', '202'], ['25/02/2022', '131', '217', '135', '179', '207', '284', '177'], ['26/02/2022', '112', '188', '96', '139', '148', '188', '125'], ['27/02/2022', '130', '189', '104', '163', '210', '221', '139'], ['28/02/2022', '118', '89', '84', '113', '259', '234', '105'], ['01/03/2022', '98', '89', '77', '82', '138', '174', '71'], ['02/03/2022', '87', '187', '69', '118', '199', '178', '59'], ['03/03/2022', '90', '200', '110', '102', '180', '216', '72']]
Advertisement
Answer
Use melt
to flatten your dataframe and set Time
as a name of the variable column. Combine columns Date
and Time
to create the timestamp then sort_values
to reorder your dataframe. Finally, keep only Timestamp
and value
columns:
combine_datetime = lambda x: pd.to_datetime(x['Date'] + ' ' + x['Time'], format='%d/%m/%Y %H:%M') out = ( df.melt('Date', var_name='Time').assign(Timestamp=combine_datetime) .sort_values('Timestamp', ignore_index=True)[['Timestamp', 'value']] ) print(out) # Output Timestamp value 0 2022-02-23 03:00:00 159 1 2022-02-23 08:00:00 140 2 2022-02-23 10:00:00 244 3 2022-02-23 12:00:00 191 4 2022-02-23 14:00:00 88 .. ... ... 58 2022-03-03 10:00:00 200 59 2022-03-03 12:00:00 110 60 2022-03-03 14:00:00 102 61 2022-03-03 19:00:00 180 62 2022-03-03 22:00:00 216 [63 rows x 2 columns]
Note: For pd.to_datetime
, I used an explicit format to avoid Pandas infer the datetime if there are any ambiguities with the day first.